1. Accomplishments
1.1. Summary of accomplishments

The mission of the Defense Threat Reduction Agency requires the quantitative study and accurate
prediction of complex multiphysics systems that couple together physical processes spanning
a wide range of scales in behavior. Treatment of such systems depends on accurate numerical
simulation of mathematical models expressed as systems of partial differential equations posed on
domains  with  complicated  geometry.  Prediction  of  the  behavior  involves  treating  the  propagation 
of  stochastic  uncertainty  through  the  mathematical  models  and  solving  inverse  problems  for  de¬ 
termining  parameters  based  on  observations  on  model  output.  Quantifying  the  accuracy  of  such 
computations requires accurate estimation of the numerical error in quantities of interest computed
from numerical solutions, and the estimation must take into account all sources of error, e.g. from
discretization, representation of geometry, and finite sampling.

This project focuses on the development of mathematical tools for dealing with these problems in
the context of multiphysics models of interest, using numerical methods relevant to the mission of
DTRA. The main approach is a posteriori error analysis based on computable residuals, solution
of adjoint problems, and variational analysis. This approach estimates the error in specified
quantities of interest. Computable residuals involving the approximate solution are used to quantify
the size of various discretization errors, while solutions of adjoint equations (generalized
Green’s functions) are used to quantify the effects of stability in producing errors. Much of the
project dealt with the significant mathematical issues that arise when numerically solving
complex multiphysics models. Practical computational constraints require the use of a wide
variety  of  discretization  approaches,  e.g.  operator  decomposition  and  splitting,  explicit  time  inte¬ 
gration,  iterative  solution  methods  with  few  iterations,  finite  volume  and  specialized  finite  differ¬ 
ence  methods.  The  introduction  of  such  techniques  complicates  both  the  identification  of  suitable 
residuals  and  definition  of  suitable  adjoint  problems.  The  project  also  dealt  with  issues  arising  in 
“multi-discretization”  approaches,  when  various  components  of  a  coupled  system  are  solved  with 
different  numerical  methods  and  numerical  grids.  Another  focus  was  the  treatment  of  problems 
posed  on  complex  domains,  e.g.  on  manifold  surfaces  in  space  and/or  on  domains  with  complex 
boundaries.  In  this  case,  the  goal  was  to  treat  the  effects  of  inaccuracies  and/or  uncertainty  in  the 
representation of the domain geometry. Finally, we also established several rigorous convergence
results for a class of goal-oriented adaptive methods that are designed to drive the error in a
specific quantity of interest below a given tolerance.
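The adjoint-based approach described above can be illustrated in its simplest setting, a linear algebraic system. The following is a minimal sketch, not code from the project: for A x = b and a quantity of interest ψ·x, the error in the quantity of interest computed from an approximate solution equals φ·(b − A x̃), where φ solves the adjoint problem Aᵀφ = ψ. The helper names (`solve`, `adjoint_error_estimate`) and the toy data are assumptions made for illustration only.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (small dense systems only)."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))  # pivot row
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def adjoint_error_estimate(A, b, x_approx, psi):
    """Estimate psi . (x - x_approx) without using the exact solution x."""
    residual = [bi - ri for bi, ri in zip(b, matvec(A, x_approx))]  # computable residual
    A_T = [list(col) for col in zip(*A)]
    phi = solve(A_T, psi)  # adjoint (generalized Green's function) solution
    return dot(phi, residual)


# Toy data (hypothetical): a small system and a deliberately perturbed "numerical" solution.
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
psi = [1.0, 0.0, 0.0]  # quantity of interest: first component of x
x_exact = solve(A, b)
x_approx = [xi + d for xi, d in zip(x_exact, [1e-2, -2e-2, 5e-3])]
estimate = adjoint_error_estimate(A, b, x_approx, psi)
true_error = dot(psi, x_exact) - dot(psi, x_approx)
```

In this linear setting the estimate reproduces the true error up to rounding. For the differential equations treated in the project, the adjoint solution itself must be approximated, so the estimate is only as accurate as that approximation.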

Along  with  theoretical  development,  the  project  studied  the  practical  implementation  of  a  pos¬ 
teriori  error  estimates  for  complex  physics,  including  high  performance  issues.  The  project  also 
addressed the question of efficient computation. The availability of accurate error estimates makes
it possible to develop efficient adaptive error control algorithms in which various discretization
parameters are adjusted based on their relative contributions to the overall error in order to achieve a
desired accuracy with minimal computational work. In another direction, the project extended a
posteriori error estimates to computed distributions and probabilities arising in computational
sensitivity analysis and developed generalized adaptive algorithms that allow for balancing all sources
of error and uncertainty affecting the analysis.

The project PIs undertook a significant degree of interdisciplinary interaction during the
project in order to ensure that project accomplishments would have impact in science and
engineering.

1.2.  Detailed  descriptions  of  specific  accomplishments 

In this section, we describe specific technical accomplishments of the project.

A posteriori error analysis for a transient conjugate heat transfer problem

We  analyzed  the  accuracy  of  an  operator  decomposition  finite  element  method  for  a  transient  con¬ 
jugate  heat  transfer  problem  consisting  of  two  materials  coupled  through  a  common  boundary.  We 
derive accurate a posteriori error estimates that account for the transfer of error between
components of the operator decomposition method as well as the errors in solving the iterative system.
We  address  a  loss  of  order  of  convergence  that  results  from  the  decomposition,  and  show  that  the 
order  of  convergence  is  limited  by  the  accuracy  of  the  transferred  gradient  information.  We  ex¬ 
tend  a  boundary  flux  recovery  method  to  transient  problems  and  use  it  to  regain  the  expected  order 
of  accuracy  in  an  efficient  manner.  In  addition,  we  use  the  a  posteriori  error  estimates  to  adap¬ 
tively  compute  the  recovered  boundary  flux  only  within  the  domain  of  dependence  for  a  quantity 
of  interest. 

A  posteriori  error  estimation  and  adaptive  mesh  refinement  for  a  multiscale  operator  decomposi¬ 
tion  approach  to  fluid-solid  heat  transfer 

We  analyze  a  multiscale  operator  decomposition  finite  element  method  for  a  conjugate  heat  transfer 
problem  consisting  of  a  fluid  and  a  solid  coupled  through  a  common  boundary.  We  derive  accurate 
a  posteriori  error  estimates  that  account  for  all  sources  of  error,  and  in  particular  the  transfer  of  error 
between  fluid  and  solid  domains.  We  use  these  estimates  to  guide  adaptive  mesh  refinement.  In 
addition,  we  provide  compelling  numerical  evidence  that  the  order  of  convergence  of  the  operator 
decomposition  method  is  limited  by  the  accuracy  of  the  transferred  gradient  information,  and  adapt 
a  so-called  boundary  flux  recovery  method  developed  for  elliptic  problems  in  order  to  regain  the 
optimal  order  of  accuracy  in  an  efficient  manner.  In  an  appendix,  we  provide  an  argument  that 
explains  the  numerical  results  provided  sufficient  smoothness  is  assumed. 

Nonparametric  density  estimation  for  randomly  perturbed  elliptic  problems 

We  study  the  nonparametric  density  estimation  problem  for  a  quantity  of  interest  computed  from 
solutions  of  an  elliptic  partial  differential  equation  with  randomly  perturbed  coefficients  and  data. 
We  derive  an  efficient  method  for  computing  samples  and  generating  an  approximate  probability 
distribution based on Lions’ domain decomposition method and the Neumann series. We then
derive  an  a  posteriori  error  estimate  for  the  computed  probability  distribution  reflecting  all  sources 
of deterministic and statistical errors. We develop an adaptive error control algorithm
based on the a posteriori estimate. Finally, we extend the analysis to include a “modeling error” term that
accounts for the effects of the resolution of the statistical description of the random variation and
modify the adaptive algorithm to adapt the resolution of the statistical description. We also prove
some related convergence results.
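To indicate why a Neumann series makes sampling cheap, the following minimal sketch treats a small linear system with a randomly perturbed matrix. It is an illustration under stated assumptions, not the method of the paper: the domain decomposition component is omitted, the 2×2 system stands in for a discretized elliptic operator, and the names `inv2`, `neumann_solve` and `sample_quantity` are hypothetical. The point is that (A + E)⁻¹b ≈ Σₖ (−A⁻¹E)ᵏ A⁻¹b uses only the unperturbed operator A, so each new sample E needs no new factorization, provided the perturbation is small enough for the series to converge.

```python
import random


def inv2(A):
    """Explicit inverse of a 2x2 matrix (stands in for a reusable factorization of A)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(A, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]


def neumann_solve(A_inv, E, b, terms):
    """Approximate (A + E)^{-1} b by the truncated series sum_{k<terms} (-A^{-1} E)^k A^{-1} b."""
    term = matvec(A_inv, b)
    total = term[:]
    for _ in range(terms - 1):
        term = [-t for t in matvec(A_inv, matvec(E, term))]
        total = [s + t for s, t in zip(total, term)]
    return total


def sample_quantity(A_inv, b, n_samples, scale, terms, seed=0):
    """Samples of the first solution component under random diagonal perturbations of A."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        E = [[rng.uniform(-scale, scale), 0.0], [0.0, rng.uniform(-scale, scale)]]
        samples.append(neumann_solve(A_inv, E, b, terms)[0])
    return samples
```

The resulting samples could then be binned into an empirical distribution function. Truncating the series introduces a deterministic error in every sample, which is one of the error sources an a posteriori estimate for the computed distribution has to account for alongside the finite number of samples.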

A  posteriori  error  analysis  for  cell-centered  finite  volume  methods  for  semilinear  elliptic  problems 

We  conduct  a  goal-oriented  a  posteriori  analysis  for  the  error  in  a  quantity  of  interest  computed 
from  a  cell-centered  finite  volume  scheme  for  a  semilinear  elliptic  problem.  To  carry  out  the 
analysis,  we  use  an  equivalence  between  the  cell-centered  finite  volume  scheme  and  a  mixed  finite 
element  method  with  special  choice  of  quadrature. 

Blockwise  adaptivity  for  time  dependent  problems  based  on  coarse  scale  adjoint  solutions 

We  describe  and  test  an  adaptive  algorithm  for  evolution  problems  that  employs  a  sequence  of 
“blocks” consisting of fixed, though non-uniform, space meshes. This approach offers the advantages
of adaptive mesh refinement but with reduced overhead costs associated with load balancing,
re-meshing,  matrix  reassembly,  and  the  solution  of  adjoint  problems  used  to  estimate  discretiza¬ 
tion  error  and  the  effects  of  mesh  changes.  We  describe  several  strategies  to  determine  appropriate 
block  discretizations  from  coarse  scale  solution  information  using  adjoint-based  a  posteriori  error 
estimates  and  demonstrate  the  behavior  of  the  algorithms  in  a  set  of  examples. 

Conservative discretization and a posteriori error analysis for cut cell diffusion problems with
complex geometry

We  study  the  solution  of  a  diffusive  process  in  a  domain  where  the  diffusion  coefficient  changes 
discontinuously  across  a  curved  interface.  We  consider  discretizations  that  use  regularly-shaped 
meshes,  so  that  the  interface  “cuts”  through  the  cells  (elements  or  volumes)  without  respecting 
the  regular  geometry  of  the  mesh.  Consequently,  the  discontinuity  in  the  diffusion  coefficients  has 
a  strong  impact  on  the  accuracy  and  convergence  of  the  numerical  method.  This  motivates  the 
derivation of computational error estimates that yield accurate estimates for specified quantities of
interest. For this purpose, we adapt the well-known adjoint-based a posteriori error analysis technique
used for finite element methods. In order to employ this method, we describe a systematic
approach  to  discretizing  a  cut-cell  problem  that  handles  complex  geometry  in  the  interface  in  a 
natural  fashion  yet  reduces  to  the  well-known  Ghost  Fluid  Method  in  simple  cases.  We  test  the 
accuracy  of  the  estimates  in  a  series  of  examples. 

A  measure-theoretic  computational  method  for  inverse  sensitivity  problems 

We  consider  the  inverse  sensitivity  analysis  problem  of  quantifying  the  uncertainty  of  inputs  to  a 
deterministic  map  given  specified  uncertainty  in  a  linear  functional  of  the  output  of  the  map.  This 
is  a  version  of  the  model  calibration  or  parameter  estimation  problem  for  a  deterministic  map.  We 
assume  that  the  uncertainty  in  the  quantity  of  interest  is  represented  by  a  random  variable  with 
a  given  distribution  and  we  use  the  Law  of  Total  Probability  to  express  the  inverse  problem  for 
the  corresponding  probability  measure  on  the  input  space.  Assuming  that  the  map  from  the  input 
space  to  the  quantity  of  interest  is  smooth,  we  solve  the  generally  ill-posed  inverse  problem  by 
using  the  Implicit  Function  Theorem  to  derive  a  method  for  approximating  the  set-valued  inverse 
that  provides  an  approximate  quotient  space  representation  of  the  input  space.  We  then  derive  an 
efficient  computational  approach  to  compute  a  measure  theoretic  approximation  of  the  probability 
measure  on  the  input  space  imparted  by  the  approximate  set-valued  inverse  that  solves  the  inverse 
problem.  We  also  treat  the  situation  in  which  the  output  of  the  map  is  determined  implicitly  and 
is difficult and/or expensive to evaluate, e.g. requiring the solution of a differential equation, and
hence  the  output  of  the  map  is  approximated  numerically.  The  main  goal  is  an  a  posteriori  error 
estimate  that  can  be  used  to  evaluate  the  accuracy  of  the  computed  distribution  solving  the  inverse 
problem  taking  into  account  all  sources  of  statistical  and  numerical  deterministic  errors.  We  present 
a  general  analysis  for  the  method  and  then  apply  the  analysis  to  the  case  of  a  map  determined  by 
the  solution  of  an  initial  value  problem. 

A  posteriori  analysis  of  multirate  numerical  methods  for  multiscale  ordinary  differential  equations 

We  analyze  a  multirate  time  integration  method  for  systems  of  ordinary  differential  equations  that 
present  significantly  different  scales  within  the  components  of  the  model.  We  interpret  the  mul¬ 
tirate  method  as  a  multiscale  operator  decomposition  method  and  use  this  formulation  to  conduct 
both  an  a  priori  error  analysis  and  a  hybrid  a  priori  -  a  posteriori  error  analysis.  The  hybrid  analy¬ 
sis  has  the  form  of  a  computable  a  posteriori  leading  order  expression  and  a  provably-higher  order 
a  priori  expression.  Both  analyses  distinguish  the  effects  of  the  discretization  of  each  component 
from  the  effects  of  multirate  solution.  The  effects  on  stability  arising  from  the  multirate  solution 
are  reflected  in  perturbations  to  certain  associated  adjoint  operators. 
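The structure of a multirate method can be sketched as follows. This is a minimal illustration assuming forward Euler for both components and a slow value held frozen during the fast substeps; the function name `multirate_euler` and its arguments are hypothetical, and the scheme analyzed in the project may differ in its time integrators and in how the components are coupled.

```python
def multirate_euler(f_slow, f_fast, ys, yf, H, n_macro, m_sub):
    """Advance a slow/fast system over n_macro macro steps of size H.

    The slow component takes one forward Euler step of size H per macro step.
    The fast component takes m_sub forward Euler substeps of size H / m_sub,
    using the slow value from the start of the macro step (frozen coupling).
    """
    h = H / m_sub
    for _ in range(n_macro):
        ys_new = ys + H * f_slow(ys, yf)  # one large step for the slow part
        for _ in range(m_sub):            # many small steps for the fast part
            yf = yf + h * f_fast(ys, yf)
        ys = ys_new
    return ys, yf
```

Freezing the slow value is what makes this an operator decomposition: each component is advanced by its own solver, and the information exchanged between the components is only approximate, which is the source of the additional error terms distinguished in the analysis.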

Convergence  theory  for  goal-oriented  adaptive  methods 

In the first of the convergence theory subprojects of the DTRA project, we developed a new convergence
theory for a general class of adaptive approximation algorithms for nonlinear operator
equations,  and  then  used  the  theory  to  obtain  convergence,  contraction,  and  optimality  results  for 
practical  adaptive  finite  element  methods  (AFEM)  applied  to  several  classes  of  nonlinear  elliptic 
equations and systems of elliptic equations. The results can be viewed as extending the recent convergence
results for linear problems of Morin, Siebert and Veeser, and of Nochetto et al., to more
general  nonlinear  problems  (with  G.  Tsogtgerel  and  Y.  Zhu).  We  also  develop  new  mathematical 
results  for  hierarchical  error  indicators  to  drive  AFEM  algorithms,  and  establish  condition  number 
estimates for appropriate preconditioners (with J. Ovall and R. Szypowski). We have further extended
these results to the class of adaptive methods that were the target of this DTRA research
project: goal-oriented adaptive methods that are designed to drive the error in a quantity of interest
below a given tolerance. In 2009, Mommer and Stevenson developed a goal-oriented adaptive
method  for  the  Poisson  equation,  together  with  rigorous  convergence  and  complexity  results  for 
their  method,  establishing  what  was  apparently  the  first  convergence  result  for  a  goal-oriented 
adaptive  method.  We  have  now  extended  the  results  of  Mommer  and  Stevenson  to  goal-oriented 
adaptive methods for general linear convection-diffusion elliptic problems (with S. Pollock). In a
second  manuscript,  these  results  were  further  extended  to  a  large  class  of  scalar  nonlinear  prob¬ 
lems  (with  S.  Pollock  and  Y.  Zhu).  All  three  articles  have  now  been  posted  on  arXiv,  submitted 
for  publication,  and  are  currently  in  review.  All  of  the  techniques  are  demonstrated  for  practical 
problems  of  interest  using  the  FETK  software  (see  below). 

Analysis  of  multiphysics  problems  with  complex  domains 

We  analyzed  a  large  class  of  regularized  Navier-Stokes  and  Magnetohydrodynamics  (MHD)  mod¬ 
els  in  three-dimensional  spatial  domains,  a  class  which  includes  the  Navier-Stokes  equations,  the 
Navier-Stokes-alpha model, the Leray-alpha model, the Modified Leray-alpha model, the Simplified
Bardina model, the Navier-Stokes-Voight model, the Navier-Stokes-alpha-like models, and
certain  MHD  models,  in  addition  to  representing  a  larger  3-parameter  family  of  models  not  previ¬ 
ously  analyzed.  We  recovered  a  number  of  known  results  for  established  models,  but  also  obtained 
new  results  for  all  models  in  this  general  family,  including  existence,  regularity,  uniqueness,  sta¬ 
bility,  attractor  existence  and  dimension,  and  existence  of  determining  operators.  (J.  Nonlinear 
Science  2009,  with  E.  Lunasin  and  G.  Tsogtgerel.) 

We  then  develop  and  analyze  numerical  methods  for  approximation  of  stationary  and  evolution 
problems on surfaces, including coupled elliptic-parabolic systems. A major theoretical breakthrough
was showing how the recent finite element error estimates of Demlow and Dziuk can be
recovered from a more general approach involving the analysis of variational crimes in Hilbert
complexes, generalizing their results for surface finite elements to arbitrary spatial dimension and
to applications involving higher-dimensional differential forms and both linear and nonlinear equations.
This generalization was made possible through the use and extension of finite element exterior
calculus (FEEC). (Found. Comput. Math. 2012, with A. Stern.) We have now extended this
work  in  FEEC  in  the  direction  of  time-dependent  problems;  we  completed  and  submitted  a  new 
manuscript  in  2012  that  extends  these  results  on  surface  finite  element  methods  to  scalar  parabolic 
and  hyperbolic  problems,  including  again  nonlinear  problems  (with  A.  Gillette).  We  also  give  an 
analysis  of  the  singularities  in  a  fundamentally  important  model  in  biochemistry,  and  develop  a 
number  of  AFEM-based  numerical  techniques  for  treating  these  degenerate  features  in  a  provably 
high-fidelity  way  (Comm.  Comput.  Phys.  2012,  with  J.  McCammon,  Y.  Zhou,  Y.  Zhu,  Z.  Yu). 

In  addition,  we  have  developed  and  implemented  goal-oriented,  adjoint-based,  a  posteriori 
error  estimates  for  elliptic  problems  on  smooth  manifolds.  In  particular,  the  estimates  take  into 
account  the  effects  of  domain  curvature  on  accuracy.  We  also  considered  the  problem  of  small 
random  perturbations  to  the  manifold,  pointing  the  way  to  treat  problems  in  which  the  domain 
is  determined  experimentally  or  by  measurement.  This  work  is  nearing  completion  and  will  be 
submitted in Summer 2012 (with W. Newton).

Analysis  of  elliptic  problems  on  domains  with  randomly  perturbed  boundaries 

We  developed  a  systematic  approach  to  solve  elliptic  problems  on  domains  that  have  randomly 
perturbed  boundaries,  after  first  classifying  such  problems  into  several  different  classes.  The  results 
are  particularly  relevant  to  situations  in  which  the  boundaries  are  obtained  through  measurement 
or  are  subject  to  error.  The  approach  avoids  the  need  to  remesh  each  new  domain  in  a  random 
sampling  Monte  Carlo  solution.  Moreover,  we  derive  a  posteriori  error  estimates  that  indicate  how 
random  perturbations  in  the  boundary  affect  the  accuracy  of  computed  solutions. 

A  posteriori  error  analysis  of  explicit,  IMEX,  and  truncated  Picard  iteration  time  integration  meth¬ 
ods 

Explicit, Implicit/Explicit (IMEX), and truncated Picard iteration time integration methods are
widely employed to solve multiphysics applications in defense and Department of Energy enterprises,
e.g. reacting flows. Such methods require significant alterations to a posteriori
error  analysis  in  order  to  describe  the  effects  of  these  approaches  on  both  stability  and  accuracy. 
Therefore,  last  year  we  undertook  the  systematic  study  of  a  posteriori  error  analysis  for  explicit, 
truncated  Picard  iteration,  and  implicit/explicit  (IMEX)  time  integration  methods.  For  explicit 
methods, we introduce special projection operators into the standard finite element formulation for
evolution problems. These projection operators are (1) a truncated Taylor expansion computed at
a past time node and (2) extrapolation from an interpolatory polynomial using values at a collection
of  previous  nodes.  We  then  alter  the  a  posteriori  error  analysis  to  include  terms  that  measure  the  ef¬ 
fects  of  these  projections,  yielding  distinct  “explicit”  time  integration  terms  in  the  a  posteriori  error 
analysis.  We  recently  have  extended  this  approach  to  treat  IMEX  methods.  To  analyze  truncated 
Picard  iteration  methods,  we  exploit  an  old  result  of  H.  Keller  and  J.  Keller  for  the  “matricant”, 
which is the exponential form of the solution operator of a linear non-autonomous evolution problem.
This provides a way to define the adjoint for a solution obtained by truncated Picard iteration,
which we then use in the a posteriori error analysis. We have also extended this analysis to implicit
methods that employ Jacobi iteration to solve the systems at each step.
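For readers unfamiliar with truncated Picard iteration as a time integrator, the following minimal sketch applies it to the scalar model problem y' = a y. It assumes a constant coefficient a and a constant initial iterate, and the function name `picard_step` is hypothetical; it does not reproduce the matricant-based adjoint construction described above. Each Picard sweep replaces the current iterate p by y_n + ∫ a p(s) ds, and the iteration is stopped after a fixed, small number of sweeps rather than run to convergence.

```python
def picard_step(a, y_n, h, iterations):
    """One time step of size h for y' = a*y using a fixed number of Picard sweeps.

    The iterate is stored as polynomial coefficients in s = t - t_n,
    starting from the constant function y_n.
    """
    coeffs = [y_n]
    for _ in range(iterations):
        # Integrate a * p(s) term by term from 0 to s, then add the initial value y_n.
        coeffs = [y_n] + [a * c / (j + 1) for j, c in enumerate(coeffs)]
    # Evaluate the final iterate at the end of the step, s = h.
    return sum(c * h ** j for j, c in enumerate(coeffs))
```

For this model problem, k sweeps reproduce the degree-k Taylor polynomial of exp(a h) y_n, so the truncation level acts like the order of an explicit method. This is one reason the truncation must be reflected in the error analysis rather than treated as a converged implicit solve.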

Coupled  parabolic-elliptic  systems 

Estep and Holst collaborated on the development of methods and a posteriori error analysis for coupled
parabolic-elliptic systems of equations. The main application is the modeling of black holes.
A new development in the Holst group has been the extension of their recent work on finite element
exterior calculus to parabolic and hyperbolic problems (completed and submitted in 2012),
which  will  provide  a  very  strong  mathematical  framework  for  the  development  of  methods  and  a 
posteriori  analysis  for  coupled  parabolic-elliptic  problems.  This  extension  to  FEEC  is  now  being 
combined  with  our  recent  work  on  goal-oriented  adaptive  methods  using  a  variational  framework, 
by  which  the  elliptic  component  of  the  system  is  combined  with  implicit  time-stepping  schemes 
to  provide  “constraints”  in  a  Lagrange  multiplier  formulation.  We  are  able  to  show  convergence 
for  the  adaptive  scheme,  generalizing  our  recent  work  on  convergence  theory  for  goal-oriented 
adaptive  methods  (with  S.  Pollock,  Y.  Zhu). 

Coupled  ordinary  differential  equation  -  parabolic  differential  equation 

Estep  and  Hameed  (along  with  collaborators)  derived  and  implemented  a  posteriori  error  estimates 
for  systems  of  evolution  equations  consisting  of  a  reaction-diffusion  problem  posed  on  a  global 
domain  coupled  to  systems  of  ordinary  differential  equations  in  a  collection  of  small  cells  parti¬ 
tioning  the  global  domain.  The  local  cell  problems  model  chemical  reactions  that  determine  the 
local  physical  conditions  driving  the  parabolic  problem.  The  analysis  takes  into  account  the  itera¬ 
tion  error  in  solving  the  coupled  systems. 

New  approaches  to  adaptive  error  control  for  evolution  problems 

Estep  and  Hameed  (along  with  collaborators)  developed  new  adaptive  error  control  algorithms  that 
take  into  account  cancellation  of  errors  to  improve  efficiency.  The  approach  identifies  periods  of 
time  over  which  there  is  significant  cancellation.  Inside  the  regions,  uniform  refinements  are  used 
to  preserve  the  favorable  cancellation,  while  the  time  step  sizes  in  the  various  regions  are  adjusted 
according  to  the  contribution  to  the  overall  error  from  the  regions. 

Implementation  of  theoretical  results 

The  last  major  goal  in  this  project  is  implementation  of  the  theoretical  results  into  the  FETK  code. 
For  this  purpose,  we  recruited  a  full  time  postdoc,  Ryan  Szypowski,  working  at  UCSD  under  the 
supervision  of  co-PI  Michael  Holst  with  responsibility  to  carry  out  the  implementation  and  testing. 
He is being jointly supervised by the PI D. Estep. This FETK development has focused on providing
a  robust,  theory-based  convergent  adaptive  finite  element  implementation  for  nonlinear  problems 
which  retains  linear  complexity.  This  has  included  work  on  the  following  specific  components, 
which have been implemented in both the MATLAB subset FETKLab of the 2D code in FETK as
well  as  in  the  full  2D/3D  code  in  FETK: 

1. The element marking strategy was updated to be based on “Dörfler marking”. Special care
was taken to use a linear-time complexity binning approach as opposed to an actual sort.
Only this type of marking strategy, which is not often used in practice due to its potential
costs unless carefully implemented, allows for establishing both convergence and linear
overall computational complexity of the adaptive algorithms.
2. A number of new error estimators were added. They include:

(a)  A  hierarchical  error  estimator  based  on  face-bump  functions  which  was  proven  in  our 
recent  publications  to  be  efficient,  reliable,  and  robust.  This  work  included  the  addition 
of  a  new  cubic  bump  finite  element  space,  which  led  to  a  better  understanding  of  how 
we  can  improve  the  finite  element  space  implementation  to  allow  for  future  additions. 

(b)  An  error  estimator  based  on  the  solution  of  a  dual  problem,  which  we  refer  to  as  dual- 
weighted  residual  (DWR).  This  implementation  involved  leveraging  the  work  on  the 
bump-function  library  above,  as  well  as  the  development  of  high-order  quadrature 
rules,  and  the  ability  to  maintain  two  distinct  and  unrelated  adaptive  meshes  during 
a  computation,  with  quantities  being  projected  back  and  forth  between  the  meshes  as 
needed. 

(c)  An  error  estimator  based  on  smoothed  gradients.  This  is  based  on  recent  work  of  R. 
Bank and J. Xu, collaborators of the PIs.

3. W. Newton, co-advised by Estep and Holst, implemented the a posteriori error estimates, developed
in his thesis, that account for error in the description of the manifold on which the problem
is posed.

4.  A  driver  application  for  solving  nonlinear  problems  using  inexact  Newton  solvers  based  on 
a  multilevel  approach  was  written.  This  has  been  used  for  most  of  the  problems  described 
above. 

5.  Prior  to  2013,  FETK  and  the  FETKLab  MATLAB  subset  of  FETK  were  primarily  based  on 
linear  finite  element  discretizations,  with  enough  partial  support  for  higher-order  elements 
to allow for the use of e.g. bump functions in error indicators and formulation of dual problems.
A general element class was developed in early 2013 to allow for use of any type
of Lagrange-type element for either the primal or dual problem. Both linear and quadratic
elements  were  then  implemented  and  are  provided  with  the  FETK  code  base  as  element 
examples.  Our  recent  manuscripts  with  new  convergence  results  for  goal-oriented  methods 
contain  a  large  collection  of  numerical  examples  that  now  exploit  this  infrastructure  to  care¬ 
fully  compare  a  number  of  adaptive  methods  based  on  goal  functions  (with  S.  Pollock  and 
Y.  Zhu). 
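The binned Dörfler marking mentioned in item 1 of the list above can be sketched as follows. This is a minimal illustration and not the FETK implementation: the function name `dorfler_mark_binned`, the choice of bins that each span a factor of two below the largest indicator, and the default of 32 bins are all assumptions. Given squared element indicators and a parameter θ in (0, 1), the goal is a set of elements whose indicators sum to at least θ times the total, chosen from the largest indicators first but without a full sort.

```python
import math


def dorfler_mark_binned(eta_sq, theta, num_bins=32):
    """Return indices whose squared indicators sum to at least theta * sum(eta_sq).

    Elements are placed in geometric bins by size relative to the largest
    indicator (bin k holds values in (max / 2**(k+1), max / 2**k]), and bins
    are consumed from largest to smallest. The cost is linear in the number
    of elements for a fixed number of bins.
    """
    total = sum(eta_sq)
    if total <= 0.0:
        return []
    eta_max = max(eta_sq)
    cutoff = eta_max * 2.0 ** (-num_bins)
    bins = [[] for _ in range(num_bins)]
    for idx, val in enumerate(eta_sq):
        if val <= cutoff:
            k = num_bins - 1  # very small (or zero) indicators share the last bin
        else:
            k = min(num_bins - 1, int(math.log2(eta_max / val)))
        bins[k].append(idx)
    marked, acc, target = [], 0.0, theta * total
    for bucket in bins:
        for idx in bucket:
            if acc >= target:
                return marked
            marked.append(idx)
            acc += eta_sq[idx]
    return marked
```

Because elements within a bin are not ordered, the marked set is not guaranteed to have minimal cardinality, but indicators in any bin other than the last differ by at most a factor of two, which is the property that makes binning a usable substitute for sorting.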

2. Training and Professional Development

The  support  of  this  project  has  partially  contributed  to  the  training  and  professional  devel¬ 
opment  for  three  graduate  students  and  three  postdocs.  This  includes  specialized  research-level 
instruction and individual mentoring as well as participation in large research group activities directed
by the PIs. Students and postdocs were encouraged to participate in professional meetings
and to interact with researchers in other universities and in national DOE laboratories as appropriate.
Students and postdocs were trained to write and to prepare and deliver professional presentations.

Details for the trainees:

•  Will  Newton  received  his  Ph.D.  from  CSU  in  2011,  and  then  was  hired  as  a  Research  Scien¬ 
tist  Class  I  in  PI  Estep’s  group.  His  primary  focus  is  a  project  on  multiscale  models  of  new 
nuclear  fuels  supported  by  a  contract  from  Idaho  National  Laboratory.  He  has  continued  to 
work on research related to this project following up on the work in his thesis. His thesis is “A
Posteriori  Error  Estimates  for  the  Poisson  Problem  on  Closed,  Two-Dimensional  Surfaces”, 
available  from  Colorado  State  University  Library. 

•  Nate  Burch  received  his  Ph.D.  from  CSU  in  2011,  and  then  took  a  two  year  postdoc  po¬ 
sition  at  SAMSI  (Statistical  and  Mathematical  Sciences  Institute)  as  part  of  the  Program 
on Uncertainty Quantification. His thesis is “Probabilistic Foundation of Nonlocal Diffusion
and Formulation and Analysis for Elliptic Problems on Uncertain Domains”, available from
Colorado  State  University  Library. 

•  The  CSU  postdoc  Jehanzeb  Hameed  is  in  the  second  year  of  his  position  in  PI  Estep’s  group. 
His primary focus is a Department of Energy Uncertainty Quantification project
that is jointly conducted with Sandia National Laboratories. Part of his research is related to
the  activities  supported  in  this  project. 

•  Jonny  Serencsa  received  his  Ph.D.  from  UCSD  in  2012,  and  has  been  doing  pre-  and  post¬ 
doctoral  work  at  UC  Davis.  His  doctoral  work  was  jointly  supervised  by  PI  Holst  and  S. 
Shkoller  at  UC  Davis,  and  he  is  currently  working  for  a  startup  company  in  the  Bay  Area. 

•  Ryan  Szypowski  received  his  Ph.D.  from  UCSD  in  2008,  and  remained  at  UCSD  working 
with  Holst  as  a  postdoc  and  then  research  scientist  until  2012.  He  moved  to  a  tenure-track 
position  in  the  Mathematics  Department  at  Cal  Poly  Pomona  in  Fall  2012. 

•  Andrew  Gillette  received  his  Ph.D.  from  UT  Austin  in  2011,  and  joined  Holst’s  group  at 
UCSD as a postdoctoral fellow in Fall 2011. He helped push forward both the project
involving Ryan Szypowski and the development of an FEEC-based error analysis framework
for parabolic and hyperbolic problems. In Fall 2013, Andrew is starting a tenure-track
faculty  position  in  the  mathematics  department  at  the  University  of  Arizona. 

•  Sara Pollock received her Ph.D. from UCSD in 2012, and remained at UCSD working with
Holst as a postdoc during the 2012-2013 academic year. In Fall 2013, Sara is starting a
3-year  named  postdoctoral  position  in  the  mathematics  department  at  Texas  A&M. 

3.  Dissemination 

We  have  disseminated  the  research  in  this  project  through  submission  of  peer-reviewed  research 
articles,  presenting  many  invited  talks  at  universities  and  conferences,  and  publishing  software 
developed  in  this  project  for  public  access.  A  summary  of  this  activity  during  this  project: 

•  53  research  articles  related  to  the  project  research  have  appeared  or  are  accepted 

•  19  research  articles  related  to  the  project  research  are  currently  under  review 

• 5 books and/or book chapters have appeared or are being written

•  60  invited  lectures  at  universities  and  professional  meetings 

Applications  to  multiscale/multiphysics  physical  and  engineering  systems 

In  conjunction  with  collaborators  in  engineering,  chemistry  and  biophysics,  we  have  applied  many 
of  the  algorithms  and  techniques  for  multiphysics  and  multiscale  problems  developed  in  this 
DTRA-supported research program. Our focus continues to be on applications in material, chemical
and biological physics of relevance to DOD, DTRA, and DOE missions. In addition to our publications
placed in the mathematics literature, we have placed joint publications from these research
collaborations with physical scientists and engineers in a broad spectrum of leading scientific journals
to maximize the impact of our results, including: Physical Review Letters, Physical Review
D, Journal of Nonlinear Science, Classical and Quantum Gravity, Journal of Chemical Theory
and Computation, Journal of Cell Science, Journal of Structural Biology, Biophysical Journal,
PLoS Computational Biology, IMA Journal on Applied Mathematics, Computer Aided Geometric
Design, BIT, Applied Numerical Mathematics, IEEE Journal on Engineering in Medicine and
Biology, IEEE Transactions on Biomedical Computing, Frontiers in Computational Physiology
and Medicine, Investigative Ophthalmology and Visual Science, Journal of Scientific Computing,
Journal of Applied Mathematics and Computation, Communications in Computational Physics,
Journal of Molecular Graphics and Modeling, Journal of Physical Chemistry B, Journal of Chemical
Physics, Communications in Mathematical Physics, Annals of Nuclear Energy, Journal
of Computational Physics, Acta Biomaterialia, Computer Methods in Applied Mechanics and Engineering,
Journal of Engineering Mathematics, and Foundations of Computational Mathematics.

4.  Products 

4.1.  Publications,  conference  papers,  and  presentations 

The following papers were accepted or appeared during March 27, 2009 - September 1, 2009

• A posteriori analysis and adaptive error control for multiscale operator decomposition methods
for coupled elliptic systems I: One way coupled systems, V. Carey, D. Estep, and S.
Tavener, SIAM Journal on Numerical Analysis 47 (2009), 740-761

• A posteriori error analysis for a transient conjugate heat transfer problem, D. Estep, S.
Tavener, T. Wildey, Finite Elements in Analysis and Design 45 (2009), 263-271

• Nonparametric density estimation for randomly perturbed elliptic problems I: Computational
methods, a posteriori analysis, and adaptive error control, D. Estep, A. Malqvist, and
S. Tavener, SIAM Journal on Scientific Computing 31 (2009), 2935-2959

• Solving the Einstein constraints on multi-block triangulations using finite elements, O. Korobkin,
B. Aksoylu, M. Holst, E. Pazos, and M. Tiglio, Class. Quant. Grav. 26 (2009), No.
14, 145007 (28 pp). (arXiv:gr-qc/0801.1823)

• An adaptive finite element method for solving the exact Kohn-Sham equation of density functional
theory, E. Bylaska, M. Holst, and J. Weare, Journal of Chemical Theory and Computation,
5 (2009), pp. 937-948.

• Finite Element Analysis of Drug Electrostatic Diffusion: Inhibition Rate Studies in N1 Neuraminidase,
Y. Cheng, M. Holst, and J.A. McCammon, Biocomputing 2009: Proceedings of
the Pacific Symposium, R.B. Altman, A.K. Dunker, L. Hunter, T. Murray, and T.E. Klein,
eds., 2009, pp. 281-292.

• Three-dimensional reconstruction reveals new details of membrane systems for calcium signaling
in the heart, T. Hayashi, M.E. Martone, Z. Yu, A. Thor, M. Doi, M. Holst, M.H.
Ellisman, and M. Hoshijima, J. Cell Sci., Vol. 122 (April, 2009), No. 7, pp. 1005-1013.

• Rough Solutions of the Einstein Constraints on closed manifolds without near-CMC conditions,
M. Holst, G. Nagy, and G. Tsogtgerel, Comm. Math. Phys., Vol. 288 (June 2009),
No. 2, pp. 547-613. (arXiv:gr-qc/0712.0798)

•  Multi-Scale  Modeling  of  Ventricular  Myocytes:  Contributions  of  structural  and  functional 
heterogeneities  to  excitation-contraction  coupling  in  the  normal  and  failing  rodent  heart,  S. 
Lu, A. Michailova, J. Saucerman, Y. Cheng, Z. Yu, T. Kaiser, W. Li, R. Bank, M. Holst, A.
McCammon,  T.  Hayashi,  M.  Hoshijima,  P.  Arzberger,  and  A.  McCulloch,  IEEE  Journal  on 
Engineering  in  Medicine  and  Biology,  Vol.  28  (March-April  2009),  No.  2,  pp.  46-57. 

•  Convergence  and  Optimality  of  Adaptive  Mixed  Finite  Element  Methods,  L.  Chen,  M.  Holst, 
and  J.  Xu,  Math.  Comp.,  Vol.  78  (2009),  No.  265,  pp.  33-53. 

The following papers were accepted or appeared during September 2, 2009 - September 1, 2010

• Nonparametric density estimation for randomly perturbed elliptic problems II: Applications
and adaptive modeling, D. Estep, A. Malqvist, S. Tavener, International Journal for Numerical
Methods in Engineering 80 (2009), 846-867

• A posteriori error analysis of a cell-centered finite volume method for semilinear elliptic
problems, D. Estep, M. Pernice, D. Pham, S. Tavener, H. Wang, Journal of Computational
and Applied Mathematics 233 (2009), 459-472

• A posteriori error estimation and adaptive mesh refinement for a multi-discretization operator
decomposition approach to fluid-solid heat transfer, D. Estep, S. Tavener, T. Wildey,
Journal of Computational Physics 229 (2010), 4143-4158

•  Blockwise  adaptivity  for  time  dependent  problems  based  on  coarse  scale  adjoint  solutions, 
V.  Carey,  D.  Estep,  A.  Johansson,  M.  Larson,  and  S.  Tavener,  SIAM  Journal  on  Scientific 
Computing 32 (2010), 2121-2145

• Numerical analysis of Ca2+ signaling in rat ventricular myocytes with realistic transverse-axial
tubular geometry and inhibited sarcoplasmic reticulum, Y. Cheng, Z. Yu, M. Hoshijima,
M. Holst, A. McCulloch, J. McCammon, and A.P. Michailova, PLoS Computational Biology,
6 (2010), pp. e1000972:1-16.

• Poisson-Nernst-Planck equations for simulation biomolecular diffusion-reaction processes
I: Finite element solutions, B. Lu, M. Holst, J. McCammon, and Y. Zhou, J. of Comput.
Phys. 229 (2010), 6979-6994 (16 pp).

•  Analysis  of  a  general  family  of  regularized  Navier-Stokes  and  MHD  models,  M.  Holst,  E. 
Lunasin,  and  G.  Tsogtgerel,  J.  Nonlin.  Sci.,  20  (2010),  pp.  523-567. 

The  following  book  chapter  appeared  during  September  2,  2009  -  September  1,  2010 

• Error estimation for multiscale operator decomposition for multiphysics problems, D. Estep,
Chapter 11, in Bridging the Scales in Science and Engineering, J. Fish, editor, Oxford
University Press, 2010

The  following  books  were  under  contract  or  appeared  during  September  2,  2009  -  April  5,  2013 

•  Practical  Analysis  in  Many  Variables,  D.  Estep,  SIAM,  2010. 

• Green’s Functions and Boundary Value Problems, Third Edition, I. Stakgold and M. Holst,
John Wiley, 888 pages, February 2011.

The  following  nonrefereed  papers  appeared  during  September  2,  2009  -  September  1,  2010 

•  CSE  2009:  Graduate  Education  in  CSE  -  Structure  for  the  Zoo?,  H.-J.  Bungartz  and  D. 
Estep,  SIAM  News  42,  2009 

• Computational Science and Engineering Education: SIAM’s Perspective, H.-J. Bungartz, D.
Estep, U. Rude, and P. Turner, IEEE Computing in Science and Engineering 11 (2009), 5-11

•  Interview  with  Chief  Editor  of  the  SIAM  CSE  Book  Series,  D.  Estep,  SIAM  News  43  (2010) 


The  following  papers  were  accepted  or  appeared  during  September  2,  2010  -  September  1,  2011 

• A computational measure theoretic method for inverse sensitivity problems I: Basic method
and analysis, J. Breidt, T. Butler, and D. Estep, SIAM Journal on Numerical Analysis,
49 (2011), 1836-1859

• A posteriori error analysis for a cut cell finite volume method, D. Estep, S. Tavener, M.
Pernice, H. Wang, Computer Methods in Applied Mechanics and Engineering, 2010, 233
(2009), 459-472

• Parameter estimation and directional leverage with applications in differential equations, N.
Burch, D. Estep, and J. Hoeting, Metrika, DOI: 10.1007/s00184-011-0358-4, 2011

• Continuum Modeling and Control of Large Mobile Networks, Y. Zhang, E. K. P. Chong, J.
Hannig, and D. Estep, Proceedings of the 49th Annual Allerton Conference on Communication,
Control and Computing, Illinois, 2011

• Nonparametric density estimation for randomly perturbed elliptic problems III: Convergence,
complexity, and generalizations, D. Estep, M. Holst, and A. Malqvist, Journal of
Applied Mathematics and Computing 38 (2012), 367-387

• An efficient, reliable and robust error estimator for elliptic problems in R^3, M. Holst, J.
Ovall, and R. Szypowski, Applied Numerical Mathematics, 61 (2011), 675-695

• Efficient mesh optimization schemes based on optimal Delaunay triangulations, L. Chen and
M. Holst, Computer Methods in Applied Mechanics and Engineering 200 (2011), 967-984

• Adaptive finite element modeling techniques for the Poisson-Boltzmann equation, M. Holst,
J. McCammon, Z. Yu, Y. Zhou, and Y. Zhu, Communications in Computational Physics, 11
(2012), pp. 179-214.

•  Convergence  analysis  of  finite  element  approximations  of  the  Joule  heating  problem  in  three 
spatial  dimensions,  M.  Holst,  M.  Larson,  A.  Malqvist,  and  R.  Soderlund,  BIT,  50  (2010), 
pp. 781-795. 

• Semilinear mixed problems on Hilbert complexes and their numerical approximation, M.
Holst and A. Stern, Foundations of Computational Mathematics, 12 (2012), pp. 363-387

• Adaptive solution of the Poisson-Boltzmann equation using goal-oriented error indicators,
B. Aksoylu, S. Bond, E. Cyr, and M. Holst, J. Sci. Comput. 52 (2012), 202-225 (23 pp).

The  following  papers  were  accepted  or  appeared  during  September  2,  2011  -  September  1,  2012 

• A computational measure theoretic approach to inverse sensitivity problems II: A posteriori
error analysis, T. Butler, D. Estep and J. Sandelin, SIAM Journal on Numerical Analysis, 50
(2012)

•  Viscoelastic  Effects  During  Loading  Play  an  Integral  Role  in  Soft  Tissue  Mechanics,  K. 
Troyer,  D.  Estep,  and  C.  Puttlitz,  Acta  Biomaterialia  8  (2012),  234-244 

• A posteriori analysis of multirate numerical method for ordinary differential equations, D.
Estep, V. Ginting, S. Tavener, Computer Methods in Applied Mechanics and Engineering,
223-224 (2012), 10-27

• Adaptive error control for an elliptic optimization problem, D. Estep
and S. Lee, Applicable Analysis, 2012, DOI: 10.1080/00036811.2012.683785, 1-15

• Analysis of routing protocols and interference-limited communication in large networks via
continuum modeling, N. Burch, E. Chong, D. Estep, J. Hannig, Journal of Engineering Mathematics,
2012, DOI: 10.1007/s10665-012-9566-9

• A numerical method for solving a stochastic inverse problem for parameters, T. Butler and
D. Estep, Annals of Nuclear Energy, 2012, DOI: 10.1016/j.anucene.2012.05.016

• Geometric variational crimes: Hilbert complexes, finite element exterior calculus, and problems
on hypersurfaces, M. Holst and A. Stern, Foundations of Computational Mathematics,
12 (2012), pp. 263-293.

• Multi-scale modeling of calcium dynamics in ventricular myocytes with realistic transverse
tubules, Z. Yu, G. Yao, M. Hoshijima, A. Michailova, and M. Holst, IEEE TBME Letters,
Special Issue on Multi-Scale Modeling and Analysis for Computational Biology and
Medicine, 58 (2011), No. 10, 2947-2951 (4 pp).

•  Multiscale  continuum  modeling  and  simulation  of  biological  processes:  From  molecular 
electro-diffusion  to  sub-cellular  signaling  transduction,  Y.  Cheng,  M.  Holst,  J.  McCammon, 
and  A.  Michailova,  Comput.  Sci.  Disc.,  5  (2012),  015002-015015  (13  pp). 

• The Navier-Stokes-Voight model for image inpainting, M. Ebrahimi, M. Holst, and E. Lunasin,
IMA J. Appl. Math., doi:10.1093/imamat/hxr069 (2012), 1-26 (26 pp).

•  Numerical  bifurcation  analysis  of  conformal formulations  of  the  Einstein  constraints,  M.  Holst 
and  V.  Kungurtsev,  Phys.  Rev.  D,  84  (2011),  pp.  124038(1)-124038(8). 

• Modeling cardiac calcium sparks in a three-dimensional reconstruction of a calcium release
unit, J. Hake, A. Edwards, Z. Yu, P. Kekenes-Huskey, A. Michailova, A. McCammon,
M. Holst, M. Hoshijima, and A. McCulloch, J. Physiol., 590 (2012), No. 18, 4403-4422 (18
pp).

• Localized glaucomatous change detection within the proper orthogonal decomposition framework,
M. Balasubramanian, D. Kriegman, C. Bowd, M. Holst, R. Weinreb, P. Sample, and
L. Zangwill, Invest. Ophthalmol. Vis. Sci., 53 (2012), No. 7, 3615-3628 (14 pp).

•  Quality  tetrahedral  mesh  smoothing  via  boundary-optimized  Delaunay  triangulation,  Z.  Gao, 
Z.  Yu,  and  M.  Holst,  Computer  Aided  Geometric  Design,  29(9):707-721, 2012. 

• Modeling effects of L-type Ca2+ current and Na+-Ca2+ exchanger on Ca2+ trigger flux in
rabbit myocytes with realistic T-tubule geometries, P. Kekenes-Huskey, Y. Cheng, J. Hake,
F. Sachse, J. Bridge, M. Holst, J. McCammon, A. McCulloch, and A. Michailova, Frontiers
in Physiology, 3 (2012), pp. 1-14.

The  following  papers  were  accepted,  appeared  or  were  submitted  and  still  pending  review  during 
September  2,  2011  -  September  1,  2012 

• A Posteriori Analysis and Adaptive Error Control for Multiscale Operator Decomposition
Solution of Elliptic Systems II: Fully Coupled Systems, V. Carey, D. Estep, S. Tavener, International
Journal of Numerical Methods in Engineering, 2011, in revision

• A posteriori analysis of an iterative multi-discretization method for reaction-diffusion systems,
J. H. Chaudhry, D. Estep, V. Ginting, and S. Tavener, Computer Methods in Applied
Mechanics and Engineering, 2012, in revision

•  A-posteriori  error  estimates  for  mixed  finite  element  and  finite  volume  methods  for  problems 
coupled  through  a  boundary  with  non-matching  grids,  T.  Arbogast,  D.  Estep,  B.  Sheehan, 
and  S.  Tavener,  IMA  J.  Numerical  Analysis,  2012,  in  revision 

• Multilevel preconditioners for discontinuous Galerkin approximations of elliptic problems
with jump coefficients, B. Ayuso de Dios, M. Holst, Y. Zhu, and L. Zikatanov, in Proceedings
of the Twentieth International Conference on Domain Decomposition Methods,
San Diego, CA, USA, February 2011.

•  Local  multilevel  preconditioners  for  elliptic  equations  with  jump  coefficients  on  bisection 
grids,  L.  Chen,  M.  Holst,  J.  Xu,  and  Y.  Zhu,  Submitted  for  publication. 

• Local convergence of adaptive methods for nonlinear partial differential equations, M. Holst,
G. Tsogtgerel, and Y. Zhu, Submitted for publication.

• The Lichnerowicz equation on compact manifolds with boundary, M. Holst and G. Tsogtgerel,
Submitted for publication.

• Adaptive finite element methods with inexact solvers for the nonlinear Poisson-Boltzmann
equation, M. Holst, R. Szypowski, and Y. Zhu, in Proceedings of the Twentieth International
Conference on Domain Decomposition Methods, San Diego, CA, USA,
February 2011.

•  Barrier  methods  for  critical  exponent  problems  in  geometric  analysis  and  mathematical 
physics,  J.  Erway  and  M.  Holst,  Submitted  for  publication. 

• Finite element error estimates for critical exponent semilinear problems without angle conditions,
R. Bank, M. Holst, R. Szypowski, and Y. Zhu, Submitted for publication.

• Convergence and optimality of goal-oriented adaptive finite element methods for nonsymmetric
problems, M. Holst and S. Pollock, Submitted for publication.

• Generalized solutions to semilinear elliptic PDE with applications to the Lichnerowicz equation,
M. Holst and C. Meier, Submitted for publication.

•  Finite  element  exterior  calculus  for  evolution  problems,  A.  Gillette  and  M.  Holst,  Submitted 
for  publication. 

• Two-grid methods for semilinear interface problems, M. Holst, R. Szypowski, and Y. Zhu,
Accepted for publication in Numer. Methods Partial Differential Equations.

• Convergence of goal-oriented adaptive finite element methods for semilinear problems, M. Holst,
S. Pollock, and Y. Zhu, Submitted for publication.

• Feature-preserving surface mesh smoothing via suboptimal Delaunay triangulation, Z. Gao,
Z. Yu, and M. Holst, Graphical Models, 75 (2013), pp. 23-38.

The following papers were accepted, appeared or were submitted and still pending review during
September 2, 2012 - April 5, 2013

• Multiphysics Simulations: Challenges and Opportunities, D. E. Keyes, L. C. McInnes, C.
Woodward, W. Gropp, E. Myra, M. Pernice, J. Bell, J. Brown, A. Clo, J. Connors, E. Constantinescu,
D. Estep, K. Evans, C. Farhat, A. Hakim, G. Hammond, G. Hansen, J. Hill,
T. Isaac, X. Jiao, K. Jordan, D. Kaushik, E. Kaxiras, A. Koniges, K. Lee, A. Lott, Q. Lu,
J. Magerlein, R. Maxwell, M. McCourt, M. Mehl, R. Pawlowski, A. Peters Randles, D.
Reynolds, B. Riviere, U. Ruede, T. Scheibe, J. Shadid, B. Sheehan, M. Shephard, A. Siegel,
B. Smith, X. Tang, C. Wilson, and B. Wohlmuth, International Journal of High Performance
Computing Applications (27), 2013.

• Continuum Modeling and Control of Large Nonuniform Wireless Networks via Nonlinear
Partial Differential Equations, Y. Zhang, E. Chong, J. Hannig, and D. Estep, Abstract and
Applied Analysis (16), 2013, doi:10.1155/2013/262581, 1-16

•  A  posteriori  error  estimates  for  explicit  time  integration  methods,  J.  Collins,  D.  Estep  and  S. 
Tavener,  BIT  Numerical  Mathematics,  2012,  submitted 

•  Continuum  Limits  of  Markov  Chains  with  Application  to  Wireless  Network  Modeling,  Y. 
Zhang,  E.  Chong,  J.  Hannig,  and  D.  Estep,  IEEE  Access,  2013,  submitted 

•  A  posteriori  error  estimation  for  the  Lax-Wendroff finite  difference  scheme,  J.  B.  Collins,  D. 
Estep,  and  S.  Tavener,  Journal  of  Computational  and  Applied  Mathematics,  2013,  submitted 

•  Convergence  and  optimality  of  adaptive  methods  in  the  Finite  Element  Exterior  Calculus 
framework,  M.  Holst,  A.  Mihalik,  and  R.  Szypowski,  Submitted  for  publication. 

• An alternative between non-unique and negative Yamabe solutions to the conformal formulation
of the Einstein constraint equations, M. Holst and C. Meier, Submitted for publication.

•  Non-uniqueness  of  solutions  to  the  conformal  formulation,  M.  Holst  and  C.  Meier,  Submitted 
for  publication. 

• Efficient computational in multiscale geometric modeling for biomolecular complexes, T. Liao,
Y. Zhang, P. Kekenes-Huskey, A. Michailova, M. Holst, and J. A. McCammon, Submitted
for publication.

•  Multilevel  preconditioners  for  discontinuous  Galerkin  approximations  of  elliptic  problems 
with  jump  coefficients,  B.  Ayuso  de  Dios,  M.  Holst,  Y.  Zhu,  and  L.  Zikatanov,  Accepted  for 
publication  in  Math.  Comp. 

4.2.  Presentations  at  meetings,  conferences,  seminars 

The following presentations were made during March 27, 2009 - September 1, 2009

Burch:  Research  Seminar,  Sandia  National  Laboratory,  Albuquerque,  New  Mexico,  8/09 

Estep:  Computational  Science  and  Engineering  (CSE)  Annual  Research  Symposium,  University 
of  Illinois,  Urbana-Champaign,  Keynote  Speaker,  4/09 

Estep:  SIAM  Annual  Meeting,  Minisymposium  on  Predictive  Computational  of  Multiscale- 
Multiphysics  Applications,  invited  speaker,  7/09 

Estep:  Workshop  on  Simulating  the  Spatial-Temporal  Patterns  of  Anthropogenic  Climate  Change, 
Los  Alamos  Institute  for  Advanced  Studies,  Santa  Fe,  New  Mexico,  invited  speaker,  8/09 

Estep:  Colloquium,  Department  of  Mathematics,  University  of  Wyoming,  9/09 

Holst: 25th Pacific Coast Gravity Meeting (PCGM25), Eugene, Oregon, 4/09

Holst:  5th  Annual  Structured  Integrators  Workshop,  Caltech,  Pasadena,  California,  Plenary  Speaker, 
5/09 

Holst:  FEniCS  2009  Workshop,  Oslo,  Norway,  Plenary  Speaker,  6/09 

Holst:  Numerische  Mathematik  50,  Munich,  Germany,  Plenary  Speaker,  6/09 

Holst: Mathematical and Numerical Geometric Analysis Workshop, Freiburg, Germany, Plenary
Speaker, 9/09

Holst: ICNAAM Conference, Crete, Greece, Minisymposium Speaker, 9/09

Serencsa:  CSME  Seminar  Series,  UC  San  Diego,  San  Diego,  California,  6/09 

The following presentations were made during September 2, 2009 - September 1, 2010

Burch: ICMS Workshop on Uncertainty Quantification, Edinburgh, UK, 05/10

Estep: Workshop on Adaptive and Multilevel Methods for Partial Differential Equations, University
of California San Diego, 11/09

Estep:  Seminar,  Lawrence  Livermore  National  Laboratory,  12/09 

Estep:  Colloquium,  Department  of  Atmospheric  Science,  Colorado  State  University,  1/10 

Estep:  Seminar,  University  of  Wisconsin,  2/10 

Estep:  Seminar,  Brown  University,  3/10 

Estep:  Seminar,  University  of  Chicago,  3/10 

Serencsa:  CCoM  Seminar  Series,  UC  San  Diego,  San  Diego,  California,  11/09 

Holst:  Plenary  Lecture,  Symposium  on  Mathematical  Systems  Biology,  UCI,  Irvine,  California, 
1/10 

Holst:  Lecture,  26th  Pacific  Coast  Gravity  Meeting  (PCGM26),  San  Diego,  CA,  3/10 

Holst:  Plenary  Lecture,  Workshop  on  Unstructured  Mesh  Methods  in  Mathematical  Physics,  Jena, 
Germany,  8/10 

Holst:  Invited  Lecture,  Department  of  Mathematics,  Free  University  of  Berlin,  Berlin,  Germany, 
8/10 

Holst:  Invited  Lecture,  Department  of  Mathematics,  Technical  University  of  Berlin,  Berlin,  Ger¬ 
many,  8/10 

Holst: Invited Lecture, Department of Mathematics, Jacobs University, Bremen, Germany, 9/10

The  following  presentations  were  made  during  September  2,  2010  -  September  1,  2011 

Estep: SIAM Computational Science and Engineering Conference, Minisymposia on Numerical
Discretization Error Estimation for Uncertainty Quantification, Progress in Computational
Methods and Software for Tightly-coupled Multiphysics Applications, Numerical Methods
for Stochastic Computation and Uncertainty Quantification, Numerical Challenges in Microstructure
Modeling for Materials Science, Reno, Nevada, 2011

Estep:  Seminar,  Lawrence  Livermore  National  Laboratory,  9/10 

Estep:  Seminar,  Purdue  University,  9/10 

Estep:  Seminar,  North  Carolina  State  University,  11/10 

Estep: Seminar, Lawrence Livermore National Laboratory, 1/11

Estep:  Seminar,  University  of  Southern  California,  3/11 

Estep: Plenary Talk, ICiS Workshop on Multiphysics Simulations: Challenges and Opportunities,
Park City, Utah, 8/11

Holst: Invited Lecture, Department of Mathematics, Jacobs University, Bremen, Germany, 9/10

Holst: Invited Lecture, Workshop on Latest Trends and Developments in Computational Technology
and Methods for Solids, Structures, Fluids and Fluid-Structure Interaction, La Jolla,
CA, 9/10

Holst: Invited ICES Lecture, University of Texas, Austin, TX, 2/11

Holst: Invited CVS Lecture, University of Texas, Austin, TX, 2/11

Holst:  Colloquium,  Department  of  Mathematics,  University  of  Wisconsin,  Madison,  WI,  4/11 

Holst:  Colloquium,  Department  of  Mathematics,  The  Penn  State  University,  State  College,  PA, 
4/11 

Holst:  Colloquium,  Department  of  Applied  Mathematics,  University  of  Washington,  Seattle,  WA, 
5/11 

Holst: Seminar, Pacific Northwest National Laboratory, Richland, WA, 5/11

Holst: Plenary Lecture, Workshop on Advances and Challenges in Computational General Relativity,
Brown University, Providence, RI, 5/11

Holst: Invited Lecture, Schnelle Löser für partielle Differentialgleichungen, Mathematisches
Forschungsinstitut Oberwolfach, Oberwolfach, Germany, 5/11

The  following  presentations  were  made  during  September  2,  2011  -  September  1,  2012 

Estep:  Invited  Lecture,  Uncertainty  Quantification  for  High-Performance  Computing  Workshop, 
Oak  Ridge  National  Laboratory,  5/12 

Estep:  Invited  Lecture,  6th  International  Conference  on  Automatic  Differentiation,  Fort  Collins, 
CO,  7/12 

Estep:  Invited  Paper,  Joint  Statistical  Meetings,  8/12 

Estep: Invited Seminar, University of Chicago, 9/11

Estep:  Invited  Seminar,  Florida  State  University,  4/12 

Estep:  Invited  Seminar,  Colorado  School  of  Mines,  4/12 

Estep:  Invited  Colloquium,  Statistical  and  Applied  Mathematical  Sciences  Institute  (SAMSI), 
4/12 

Holst: Invited Lecture, Workshop on Geometric Partial Differential Equations: Theory, Numerics
and Applications, Mathematisches Forschungsinstitut Oberwolfach, Oberwolfach, Germany,
11/11

Holst:  Invited  Lecture,  JTO  Faculty  Fellowship  Lecture  (1  of  2),  Institute  for  Computational 
Engineering  and  Science  (ICES),  University  of  Texas,  Austin,  TX,  11/11 

Holst:  Invited  Lecture,  JTO  Faculty  Fellowship  Lecture  (2  of  2),  Institute  for  Computational 
Engineering  and  Science  (ICES),  University  of  Texas,  Austin,  TX,  1/12 

Holst:  Plenary  Lecture,  CSU  Research  Colloquium,  Physics  at  CSU:  Neutrinos  to  Nano  Science, 
Colorado  State  University,  Fort  Collins,  CO,  3/12 

Holst: Plenary Lecture, 21st International Conference on Domain Decomposition Methods, Rennes,
France, 6/12

The following presentations were made during September 2, 2012 - April 5, 2013

Pollock:  Center  for  Computational  Mathematics  Seminar,  UCSD,  San  Diego,  CA,  1/13. 

Pollock:  Joint  MAA-AMS  Mathematics  Meetings,  San  Diego,  CA,  1/13. 

Pollock:  Numerical  analysis  seminar,  Texas  A&M  University,  College  Station,  TX,  4/13. 

Pollock:  CSME  Seminar,  UCSD,  San  Diego,  CA,  4/13. 

Pollock:  Minisymposium  Lecture,  SIAM  Annual  Meeting,  San  Diego,  CA,  7/13. 

4.3.  Websites 

Research  results  and  software  are  presented  at 

•  http://www.stat.colostate.edu/~estep/ 

•  http://ccom.ucsd.edu/~mholst/ 

4.4.  Technologies  and  techniques 

Over the last several years, our DTRA-supported research team has led the development of
the Finite Element ToolKit (FETK), an open-source finite element modeling toolkit designed for
the simulation of coupled multiphysics problems with multiscale phenomena. The software has
been designed and developed collaboratively by Holst and Estep, and consists of a collection
of object-oriented class libraries written in C, C++, Objective C, and Python. There is also a
MATLAB/Octave-based prototyping tool (FETKLab), developed by Estep and Holst together
with several of their graduate students. FETK and FETKLab are
designed to adaptively discretize and solve coupled reaction-diffusion systems, and are built around
state-of-the-art algorithms for simplex mesh generation, error estimation, mesh refinement, finite
element discretization, iterative nonlinear and optimization techniques, and fast multilevel and
domain decomposition-based linear solvers and preconditioners. Many of the algorithms developed
in our research articles as described in this report have been prototyped, implemented, and applied
in conjunction with physical scientists using FETK. The entire FETK source tree
was released in June 2010 on the FETK.org website, as a major milestone of this DTRA project.
A substantial extension to both FETK and FETKLab was completed in Spring 2013 that added
general Lagrange-type elements for either primal or dual problems, and this new capability has
been exploited in a number of our recent articles.

In addition, we continue development on GAASP (Globally Accurate, Adaptive Sensitivity
analysis Package) to extend its capabilities for both forward and inverse stochastic sensitivity analysis
of differential equations.

5.  Impact 

5.1.  Impact  on  the  principal  disciplines  of  the  project 

The  numerical  solution  of  multiscale,  multiphysics  models  on  complex  domains  along  with 
the  development  of  tools  for  predictive  science  and  uncertainty  quantification  is  one  of  the  grand 
challenges  facing  the  mathematical  sciences  at  present.  Such  problems  present  a  very  complex  pic¬ 
ture  in  terms  of  stability  and  important  behaviors  interacting  across  a  wide  range  of  scales,  which 
makes  the  straightforward  use  of  classical  numerical  methods  and  analyses  extremely  problematic, 
if  not  impossible.  Classic  approaches  were  developed  in  the  context  of  models  involving  single 
physics  phenomena  operating  at  a  narrow  range  of  scales.  While  building  on  classic  approaches, 
the  research  in  this  project  contributes  at  a  fundamental  theoretical  level  by  laying  the  foundation 
for  reliably  accurate  and  efficient  numerical  solution  based  on  a  posteriori  error  analysis  that  ac¬ 
counts  for  the  numerical  complexities  involved  with  simulating  such  systems.  This  is  achieved 
by combining extremely sophisticated mathematics in analysis and geometry with cutting edge
numerical  methodology. 

The  impact  of  the  research  related  to  this  project  is  widespread,  as  can  be  seen  in  the  greatly 
increasing  levels  of  activity  around  the  world  on  such  problems.  This  is  also  evidenced  by  the 
number  of  invitations  to  speak,  the  number  of  funded  interdisciplinary  projects  including  a  recent 
award  of  an  extremely  prestigious  National  Science  Foundation  Focused  Research  Group  (FRG) 
award  to  Estep  and  Holst,  the  citation  record  (Estep’s  h-index  is  15  and  Holst’s  h-index  is  20), 
and the high level of involvement of the PIs in the research environment through panels, reports,
editing,  and  so  on. 

5.2.  Impact  on  other  disciplines 

Developing  reliable  and  accurate  tools  for  carrying  out  predictive  science  and  engineering  for 
multiscale,  multiphysics  systems  on  complex  domains  and  conducting  uncertainty  quantification 
in  simulated  results  is  the  major  problem  of  computational  science  and  engineering  at  present. 
Addressing  this  challenge  requires  fundamental  research  in  the  mathematical  sciences.  This  project 
is  aimed  at  addressing  a  number  of  key  research  problems  involved  with  simulating  multiphysics 
systems. Along with theory, the PIs systematically implement the results into public software,
and,  along  with  their  collaborators,  use  the  software  to  tackle  scientific  and  engineering  research 
problems.  This  yields  a  direct  transfer  of  the  theoretical  mathematical  developments  and  software 
implementations  to  the  application  domain. 

This is evidenced by the large number of interdisciplinary collaborations of the PIs and the
substantial  volume  of  interactions  with  Department  of  Energy  laboratories  and  industry.  Details 
are  provided  below. 

5.3. Impact in the profession

5.4.  Honors  and  awards 

Estep  was  appointed  (founding)  Co-Editor  in  Chief  of  the  SIAM  /  ASA  Journal  on  Uncertainty 
Quantification 

Estep won the University Scholarship Impact Award, Colorado State University, 2011

Estep  was  appointed  University  Interdisciplinary  Research  Scholar,  Colorado  State  University  in 
2009 

Estep  received  the  Oliver  P.  Pennock  Distinguished  Service  Award,  Colorado  State  University  in 
2009 

Estep  was  appointed  Editor  in  Chief,  SIAM  Book  Series  on  Computational  Science  and  Engi¬ 
neering,  2009  -  2014 

Holst  received  the  CSU  Distinguished  Alumnus  Award,  2009 

Holst  was  appointed  the  Chancellor’s  Associates  Endowed  Chair  in  Mathematics  and  Physics  at 
UC  San  Diego  in  2012 

5.5.  Impact  on  the  professional  research  community 

Estep  served  as  one  of  the  Moderators  for  the  SAMSI  National  SIAM  and  ASA  Town  Hall  Meet¬ 
ing  on  Uncertainty  Quantification,  2010 

Estep  served  as  the  Co-Organizer  and  first  Chair,  SIAM  Activity  Group  on  Uncertainty  Quantifi¬ 
cation,  2010 

Estep served as a Program Leader for the SAMSI Program on Uncertainty Quantification,
2011-2012

Estep served as co-chair of the first SIAM/ASA/USACM Conference on Uncertainty Quantification
(April, 2012)

Estep  along  with  J.  Berger  (Duke)  and  M.  Gunzburger  (FSU)  proposed  a  new  Journal  on  Uncer¬ 
tainty  Quantification  to  be  jointly  published  by  the  ASA  and  SIAM 

Estep  serves  on  the  Advisory  Board  for  the  Center  for  Advanced  Modeling  and  Simulation,  Idaho 
National  Laboratory,  2009  -  2012 

Estep  serves  on  the  Governing  Board  of  the  Statistical  and  Applied  Mathematical  Sciences  Insti¬ 
tute  (SAMSI),  2009-2016 

Estep  served  on  the  National  Science  Foundation  Office  of  Cyberinfrastructure  Grand  Challenges 
Communities  Task  Force,  2009-2010  (co-author  of  final  recommendation  report) 

Estep served as Breakout Lead and Report co-author, Uncertainty Quantification and Stochastic
Systems,  Department  of  Energy  Cross-Cutting  Technologies  for  Computing  at  the  Exascale, 
2010 

Estep  was  an  invited  participant  in  the  Fusion  Simulation  Program  Definition  Workshop,  2011 

Estep serves on the American Mathematical Society Simmons Travel Grants Committee,
2011-2014

Estep  serves  as  Moderator,  Mathematics  in  the  Geosciences  Workshop,  Northwestern  University, 
2011 

Estep  was  co-author  of  Multiphysics  Simulations:  Challenges  and  Opportunities,  Tech.  Report 
ANL/MCS-TM-321, Argonne National Laboratory, 2011

Estep  was  co-author  of  Fostering  Interactions  Between  the  Geosciences  and  Mathematics,  Statis¬ 
tics,  and  Computer  Science,  Technical  Report  TR-2012-02,  Department  of  Computer  Sci¬ 
ence,  University  of  Chicago,  2012 

Holst  serves  on  the  Executive  Committee  for  the  San  Diego  Supercomputer  Center  (SDSC),  2007- 
present 

Holst  is  a  Co-Organizer  (with  R.  Bank)  of  20th  International  Conference  on  Domain  Decompo¬ 
sition  (DD20),  February  2011. 

Holst  is  the  Primary  Organizer  (with  J.  Hameed):  Numerical  Methods  for  Implicit  Models  in 
Biomolecular Systems, SIAM CS&E Conference Minisymposium, March 2011

Holst  is  the  Primary  Organizer  (with  A.  Demlow,  A.  Gillette,  Y.  Zhu):  Workshop  on  Exploit¬ 
ing  Geometry  in  the  Development  of  Numerical  Methods  for  Partial  Differential  Equations, 
UCSD  Workshop,  San  Diego,  November  2011. 

Holst  is  the  Primary  Organizer  (with  A.  Demlow,  R.  Szypowski):  Exploiting  Geometry  in  the 
Development  of  Numerical  Methods  for  Partial  Differential  Equations,  SIAM  Analysis  of 
PDE Conference Minisymposium, November 2011.

Holst is the Primary Organizer (with D. Arnold, A. Gillette): AMS Joint Meeting FEEC
Minisymposium on New Developments in the Finite Element Exterior Calculus, January 2013.

Holst  is  the  Primary  Organizer  (with  A.  Gillette,  R.  Szypowski):  Workshop  on  Exploiting  Geom¬ 
etry  in  the  Development  of  Numerical  Methods  for  Partial  Differential  Equations  II,  UCSD 
Workshop,  San  Diego,  January  2013. 

Holst  and  Estep  regularly  serve  on  Grant  Review  Panels  for  NSF  and  DOE,  2004-present 

5.6. Professional editorial appointments

Estep: Co-Editor in Chief (founding), SIAM / ASA Journal on Uncertainty Quantification

Estep:  Editor  in  Chief,  SIAM  Book  Series  on  Computational  Science  and  Engineering,  2009  - 
2014 

Estep: Associate Editor, SIAM Journal on Numerical Analysis, 2005-2011

Estep:  Associate  Editor,  International  Journal  for  Uncertainty  Quantification,  2010- 

Estep:  Associate  Editor,  Multiphysics  Modeling  Book  Series,  A.  A.  Balkema  Publishing,  CRC 
Press,  2010- 

Estep:  Associate  Editor,  Journal  of  Applied  Mathematics  and  Computing,  2008-2013 

Holst:  Associate  Editor,  Numerische  Mathematik,  2008-present 

Holst:  Associate  Editor,  SIAM  Book  Series  on  Computational  Science  and  Engineering,  2009- 
2014 

5.7. Impact on technology transfer

The PIs maintain a very substantial interdisciplinary collaboration activity with scientists and
engineers  in  universities,  Department  of  Energy  laboratories,  and  industry.  These  collaborations 
lead  to  direct  injection  of  research  ideas  into  practical  use. 

5.8.  Consulting  and  collaborative  activities 

In  this  section,  we  report  currently  funded  projects  that  involve  substantial  interdisciplinary 
collaborations  and  transfer  of  research  results  related  to  this  project  into  applications. 

Estep is CO-PI on the project Framework Application for Core-Edge Transport Simulations (FACETS)
funded by the Office of Advanced Scientific Computing Research and Office of Fusion Energy
Sciences, Department of Energy, 2007-12. Collaborators include: R. H. Cohen, L. Diachin,
and T. Epperly at Lawrence Livermore National Laboratory; J. Larson and L. McInnes
at  Argonne  National  Laboratory;  M.  R.  Fahey  and  J.  Cobb  at  Oak  Ridge  National  Laboratory. 
Subject  is  development  and  analysis  of  numerical  solution  methods  for  coupled  core-edge 
fusion  simulations. 

Estep  is  PI  on  the  project  Collaborative  Proposal:  Transforming  How  Climate  System  Models 
are  Used:  A  Global,  Multi-Resolution  Approach  to  Regional  Ocean  Modeling  funded  by  the 
Department of Energy, 2009-11. Collaborators include Todd Ringler at Los Alamos National
Laboratory.  Subject  is  development  and  analysis  of  numerical  methods  for  multiscale  ocean 
models. 

Estep  is  PI  on  the  project  Adjoint-based  methods  for  uncertainty  quantification  funded  by  the 
Lawrence  Livermore  National  Laboratory,  2010-13.  Collaborators  are  Carol  Woodward  and 
Jeff Hittinger at Lawrence Livermore National Laboratory. Duties include (1) developing
a posteriori error estimates for hyperbolic problems including shock behavior and (2) consulting
on uncertainty and error quantification with laboratory personnel.

Estep  is  CO-PI  on  the  project  The  Inverse  Problem  for  Estimation  of  Structure  of  Biological  Macro¬ 
molecules  from  Small-Angle  X-Ray  Scattering  funded  by  the  National  Institutes  of  Health, 
2010-2014.  Collaborators  include  Jay  Breidt  (Statistics,  CSU)  and  Karolin  Luger  (Biochem¬ 
istry,  CSU).  Subject  is  determining  the  structure  of  biological  macromolecules  using  small 
angle  x-ray  scattering  data. 


Estep  is  PI  on  the  project  Enabling  Predictive  Simulation  and  UQ  of  Complex  Multiphysics 
PDE  Systems  by  the  Development  of  Goal-Oriented  Variational  Sensitivity  Analysis  and 
a-Posteriori  Error  Estimation  Methods  funded  by  the  Department  of  Energy,  2010-2013. 
Collaborators  include  John  Shadid  (Sandia  Nat.  Lab.)  and  Victor  Ginting  (U.  Wyom.). 
Subject  is  developing  a  posteriori  error  estimates  for  solutions  of  reacting  flow  and  fusion 
reaction  models. 

Estep  is  CO-PI  on  the  project  Collaborative  Research:  A  posteriori  error  analysis  and  adaptiv¬ 
ity  for  discontinuous  interface  problems  funded  by  the  National  Science  Foundation,  2010- 
2013.  Collaborator  is  Simon  Tavener  (CSU).  Purpose  is  developing  and  analyzing  conser¬ 
vative  solution  methods  for  elliptic  problems  with  coefficients  that  are  discontinuous  across 
complex  interfaces. 

Estep  is  PI  on  the  CSU  Subcontract  from  Multiscale  Design  Systems,  LLC  supported  by  an 
Air  Force  SBIR  Phase  II  grant.  Collaborators  are  Simon  Tavener  (CSU)  and  Jacob  Fish 
(Columbia  Uni.)  in  2011.  Purpose  is  developing  fast  methods  for  UQ  for  multiscale  models 
of  polymers  in  stressed  environments. 

Estep is PI on the project Uncertainty Analysis for Multiscale Models of Nuclear Fuel Performance
supported by the Idaho National Laboratory from 2011-2014. Collaborators are Simon
Tavener (CSU) and Michael Pernice (Idaho Nat. Lab.). Purpose is UQ for multiscale
models  of  nuclear  fuel. 

Estep  is  PI  on  the  project  11-2031:  Multiscale  modeling  and  uncertainty  quantification  for 
nuclear  fuel  performance,  Nuclear  Energy  University  Programs,  Department  of  Energy, 
2011-14. Collaborators are Simon Tavener (CSU), Michael Pernice (INL), Peter Polyakov
(Wyoming),  Dongbin  Xiu  (Purdue),  Anter  el  Azab  (Purdue) 

Estep  is  a  co-PI  on  the  project  Data-Driven  Inverse  Sensitivity  Analysis  for  Predictive  Coastal 
Ocean  Modeling,  Computational  and  Data-Enabled  Science  and  Engineering  in  Mathemati¬ 
cal  and  Statistical  Sciences  (CDS&E-MSS),  National  Science  Foundation,  2012-15.  Collab¬ 
orators  are  Troy  Butler  (CSU),  Clint  Dawson  (U.  Texas  at  Austin),  and  Joannes  Westerink 
(Notre  Dame) 

Estep and Holst are co-PIs on the project FRG: Error Quantification and Control for Gravitational
Waveform Simulation funded by the National Science Foundation, 2011-2014. The
Project  is  concerned  with  estimating  the  error  in  computed  wave  forms  obtained  from  LIGO 
data. 

Holst is PI on the project FRG: Analysis of the Einstein Constraint Equations funded by the
National  Science  Foundation,  2013-2016.  The  Project  is  concerned  with  further  extending 
the  solution  theory  for  the  Einstein  constraint  equations. 

Holst is PI on the project MRI: Acquisition of a Parallel Computing and Visualization Facility to
Enable  Integrated  Research  and  Training  in  Modern  Computational  Science,  Mathematics, 
and  Engineering  funded  by  National  Science  Foundation,  2008-2011.  Collaborators  include 
Randolph  Bank  (UCSD  Mathematics),  Scott  Baden  (UCSD  Computer  Science),  and  John 
Weare (UCSD Chemistry). The subject is the design and construction of a state-of-the-art
parallel computing system with more than 1000 compute nodes, Infiniband high-speed
network fabric, parallel filesystems, and LCD visualization walls, housed in a modern server
room with raised floor and forced chilled air.

Holst  is  PI  on  the  project  Adaptive  Methods  and  Finite  Element  Exterior  Calculus  for  Nonlinear 
Geometric  PDE,  funded  by  National  Science  Foundation,  2012-2015.  Co-PI  is  former  stu¬ 
dent  and  postdoc  Ryan  Szypowski,  now  an  assistant  professor  in  mathematics  at  Cal  Poly 
Pomona. The subject is the design and analysis of adaptive methods for use with the finite
element  exterior  calculus. 

Holst  is  Co-PI  on  the  project  Adaptive  Radiotherapy  Based  on  High  Performance  Computing 
funded  by  the  Department  of  Energy,  Lawrence  Livermore  National  Laboratory,  and  the  Uni¬ 
versity  of  California,  2009-2012.  Collaborators  include  Steve  Jian  (UCSD  Medical  School), 
A.  Majumdar  (SDSC),  and  D.J.  Choi  (SDSC).  The  subject  is  realtime  solution  of  coupled 
reaction-diffusion  systems  and  the  Boltzmann  transport  equation  using  a  combination  of  par¬ 
allel  algorithms  for  partial  differential  equations,  high-speed  communication  networks,  and 
cluster  computers. 

Holst  is  Co-PI  on  the  project  Scalable  Adaptive  Multilevel  Solvers  for  Multiphysics  Problems, 
funded by the Department of Energy. The subject is the design and analysis of deterministic
algorithms  for  use  in  physical  simulation  based  on  multilevel  technologies. 

Holst  is  Co-PI  on  the  project  Applications  of  Quantum  Computing  in  Aerospace  Science  and 
Engineering, funded by the Air Force Office of Scientific Research. The subject is the design
and  analysis  of  quantum  algorithms  for  use  in  physical  simulation. 

Holst is Co-PI and Core 1A lead on the project National Biomedical Computation Resource
(NBCR)  funded  by  the  National  Institutes  of  Health,  2009-2014.  Collaborators  include  An¬ 
drew  McCammon  (UCSD  Chemistry),  Andrew  McCulloch  (UCSD  Bioengineering),  Mark 
Ellisman  (UCSD  Medical  School),  and  Peter  Arzberger  (SDSC).  The  subject  is  multiscale 
modeling  frameworks  and  adaptive  finite  element  methods  for  complex  multiscale  and  mul¬ 
tiphysics  problems  arising  in  biomedical  science. 

Holst  is  Senior  Scientist  and  founding  member  of  the  NSF  Physics  Frontier  Center  for  Theoret¬ 
ical  Biological  Physics  (CTBP),  funded  by  the  National  Science  Foundation.  Collaborators 
include José Onuchic (UCSD Physics), Andrew McCammon (UCSD Chemistry), and Andy
Kummel  (UCSD  Chemistry).  The  subject  is  multiscale  modeling  frameworks  and  adaptive 
finite  element  methods  for  complex  multiscale  and  multiphysics  problems  arising  in  bio¬ 
physics. 

5.9. Transitions to technology applications
We  report  on  current  interactions  with  industry. 

Estep  was  a  Co-Principal  Investigator  in  the  Tech  X,  Inc.  project  Framework  Application  for 
Core-Edge  Transport  Simulations  (FACETS),  funded  by  the  Office  of  Advanced  Scientific 
Computing  Research  and  Office  of  Fusion  Energy  Sciences,  Department  of  Energy.  Estep’s 
responsibilities  include  development  and  analysis  of  numerical  solution  methods  for  coupled 
core-edge  fusion  simulations.  Algorithms  developed  in  this  program  will  be  implemented 
into  the  FACETS  high  performance  framework. 

Estep was a subcontractor in a Phase II project for Multiscale Design Systems, LLC (Principal Officer:
Jacob  Fish,  Rensselaer  Polytechnic  Institute)  for  the  Air  Force  SBIR/STTR  program.  Es¬ 
tep’s  responsibilities  include  development  of  multiscale  operator  decomposition  numerical 
methods  and  numerical  methods  for  error  estimation,  uncertainty  quantification  and  inverse 
problems  for  parameter  identification  for  multiscale  multiphysics  models  of  hygro-thermo- 
mechano-oxidation-fatigue  in  polymer  matrix  composites  used  in  aircraft  applications.  Al¬ 
gorithms  developed  in  this  program  will  be  implemented  into  the  Multiscale  Design  Sys¬ 
tem  for  Continuum  (MDS-C)  and  the  Multiscale  Design  System  for  Discrete  or  atomistic 
medium  (MDS-D)  software  packages. 

Holst is collaborating with Eric Bylaska at Pacific Northwest National Laboratory on the incorporation
of the Finite Element Toolkit (FETK, developed and maintained by the Holst Group)
into  several  density  functional  modeling  packages  based  at  PNNL. 

Abstract. We consider the nonparametric density estimation problem for a quantity of interest
computed from solutions of an elliptic partial differential equation with randomly perturbed
coefficients and data. Our particular interest is in problems for which only limited knowledge
of the random perturbations is available. We derive an efficient method for computing samples
and generating an approximate probability distribution based on Lions' domain decomposition
method and the Neumann series. We then derive an a posteriori error estimate for the computed
probability distribution reflecting all sources of deterministic and statistical errors. Finally,
we develop an adaptive error control algorithm based on the a posteriori estimate.

Key words. a posteriori error analysis, adjoint problem, density estimation, domain decomposition,
sensitivity analysis


AMS  subject  classifications.  65N15,  65N30,  65N55,  65C05 
DOI.  10.1137/080731670 


1.  Introduction.  The  practical  application  of  differential  equations  to  model 
physical  phenomena  presents  problems  in  both  computational  mathematics  and  statis¬ 
tics.  The  mathematical  issues  arise  because  of  the  need  to  compute  approximate  so¬ 
lutions  of  difficult  problems,  while  statistics  arises  because  of  the  need  to  incorporate 
experimental  data  and  model  uncertainty.  The  consequence  is  that  significant  error 
and  uncertainty  attend  any  computed  information  from  a  model  applied  to  a  con¬ 
crete  situation.  The  problem  of  quantifying  that  error  and  uncertainty  is  critically 
important. 

We consider the nonparametric density estimation problem for a quantity of interest
computed from the solutions of an elliptic partial differential equation with randomly
perturbed coefficients and data. The ideal problem is to compute a quantity
of interest $Q(U)$, expressed as a linear functional, of the solution $U$ of

(1.1)   $-\nabla \cdot (\mathcal{A}(x) \nabla U) = G(x)$,  $x \in \Omega$,
        $U = 0$,  $x \in \partial\Omega$,


where $\Omega$ is a convex polygonal domain with boundary $\partial\Omega$ and $\mathcal{A}(x)$ and $G(x)$ are
stochastic functions that vary randomly according to some given probability structure.
The problem (1.1) is interpreted to hold almost surely (a.s.), i.e., with probability 1.
Under suitable assumptions, e.g., $\mathcal{A}$ and $G$ are uniformly bounded and have piecewise
smooth dependence on their inputs (a.s.) with continuous and bounded covariance
functions and $\mathcal{A}$ is uniformly coercive, $Q(U)$ is a random variable. The density estimation
problem is as follows: Given probability distributions describing the stochastic
nature of $\mathcal{A}$ and $G$, determine the probability distribution of $Q$. The approach we
use extends to problems with more general Dirichlet or Robin boundary conditions
in which the data for the boundary conditions are randomly perturbed as well as
problems with more general elliptic operators in a straightforward way.

The  parametric  density  estimation  problem  assumes  that  the  output  distribution 
is  one  of  the  standard  distributions  so  that  the  problem  involves  determining  values 
for  the  parameters  defining  the  distribution.  The  nonparametric  density  estimation 
problem  is  relevant  when  the  output  distribution  is  unknown  and/or  complicated, 
e.g.,  multimodal.  In  this  case,  we  seek  to  compute  an  approximate  distribution  for 
the  output  random  variable  using  sample  solutions  of  the  problem.  A  limited  version 
of  this  problem  is  to  seek  only  to  compute  one  or  two  moments,  e.g.,  the  expected 
value.  However,  this  is  of  limited  utility  when  the  output  distribution  is  complicated, 
as  it  tends  to  be  for  outputs  computed  from  (1.1)  under  general  conditions. 

Nonparametric density estimation problems are generally approached using a
Monte Carlo sampling method. Samples $\{\mathcal{A}^n, G^n\}$ are drawn from their distributions,
solutions $\{U^n\}$ are computed to produce samples $\{Q(U^n)\}$, and the output
distribution is approximated using a binning strategy coupled with smoothing. This
ideal  density  estimation  problem  poses  several  computational  issues,  including  the 
following. 

1. We have only limited information about the stochastic nature of $\mathcal{A}$ and $G$.

2. We can compute only a finite number $\mathcal{N}$ of sample solutions.

3.  The  solution  of  (1.1)  has  to  be  computed  numerically,  which  is  both  expensive 
and  leads  to  significant  variation  in  the  numerical  error  as  the  coefficients  and 
data  vary. 

4.  The  output  distribution  is  an  approximation  affected  by  the  binning  and 
smoothing  strategies. 
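
The sampling-and-binning workflow described above can be sketched on a toy problem. The following is only an illustration under strong simplifying assumptions that are not part of the paper: a one-dimensional problem $-(\mathcal{A} U')' = 1$ on $(0,1)$ with homogeneous Dirichlet conditions, a coefficient that is constant in space for each sample, and the quantity of interest $Q(U) = \int_0^1 U\,dx$, for which the exact value $1/(12\mathcal{A})$ is available. The function names `sample_qoi` and `binned_density` are hypothetical.

```python
import numpy as np

def sample_qoi(rng, n_samples, a=1.0, delta=0.1):
    """Draw samples of Q(U) for -(A U')' = 1 on (0, 1), U(0) = U(1) = 0.

    Toy assumption: the coefficient A = a + perturbation is constant in x,
    with the perturbation uniform on [-delta * a, delta * a].  In that case
    U(x) = x (1 - x) / (2 A), so Q(U) = integral of U over (0, 1) = 1 / (12 A).
    """
    coeff = a + rng.uniform(-delta * a, delta * a, size=n_samples)
    return 1.0 / (12.0 * coeff)

def binned_density(samples, n_bins=20):
    """Approximate the output density by binning (a normalized histogram)."""
    density, edges = np.histogram(samples, bins=n_bins, density=True)
    return density, edges

rng = np.random.default_rng(0)
q = sample_qoi(rng, n_samples=5000)      # finite number of samples (issue 2)
density, edges = binned_density(q)       # binning without smoothing (issue 4)
```

In a realistic setting the closed-form expression for $Q$ would be replaced by a numerical solve for each sample, which is where the third issue in the list (cost and sample-dependent numerical error) enters.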

In  this  paper,  we  consider  the  first  three  issues.  Our  goals  are  to  construct  an  ef¬ 
ficient  numerical  method  for  approximating  the  cumulative  density  function  for  the 
output  distribution  and  to  derive  computable  a  posteriori  error  estimates  that  account 
for  the  significant  effects  of  error  and  uncertainty  in  the  approximation.  We  fully  de¬ 
velop  the  adaptive  algorithm,  extend  the  analysis  to  include  adaptive  modeling,  and 
test  the  algorithm  on  several  problems  in  [5].  In  [3],  we  present  convergence  proofs 
for  the  method  described  in  this  paper.  There  are  many  papers  addressing  the  fourth 
issue  in  the  statistics  literature,  e.g.,  kernel  density  estimation. 

Our main goal is to treat the effects of stochastic variation in the diffusion coefficient
$\mathcal{A}$. The treatment of a problem in which just the right-hand side and data
vary stochastically is somewhat easier because there is just one differential operator
to be inverted. When the elliptic coefficient varies stochastically, we are dealing with
a family of differential operators. We include a brief treatment of stochastic variation
in the right-hand side and data to be complete.

1.1. Some notation. The notation is cumbersome since we are dealing with two
discretizations: solution of the differential equation and approximation of a probability
distribution by finite sampling. Generally, capital letters denote random variables, or
samples, and lowercase letters represent deterministic variables or functions. When
this assignment is violated, we use italics to denote deterministic quantities. We
let $\Omega \subset \mathbb{R}^d$, $d = 2, 3$, denote the piecewise polygonal computational domain with
boundary $\partial\Omega$. For an arbitrary domain $\omega \subset \Omega$ we denote the $L^2(\omega)$ scalar product
by $(v, w)_\omega = \int_\omega v w \, dx$ in the domain and $\langle v, w \rangle_{\partial\omega} = \int_{\partial\omega} v w \, ds$ on the boundary,
with associated norms $\| \cdot \|_\omega$ and $| \cdot |_{\partial\omega}$. When $\omega = \Omega$, we drop the index in the scalar
products. We let $H^s(\omega)$ denote the standard Sobolev space of smoothness $s$ for $s \ge 0$.
In particular, $H^1_0(\Omega)$ denotes the space of functions in $H^1(\Omega)$ for which the trace is 0
on the boundary. If $\omega = \Omega$, we drop $\omega$, and also if $s = 0$, we drop $s$; i.e., $\| \cdot \|$ denotes
the $L^2(\Omega)$-norm.

We assume that any random vector $X$ is associated with a probability space
$(\Lambda, \mathcal{B}, P)$ in the usual way. We let $\{X^n, n = 1, \dots, \mathcal{N}\}$ denote a collection of samples.
We assume it is understood how to draw these samples. We let $E(X)$ denote
the expected value, $\mathrm{Var}(X)$ denote the variance, and $F(t) = P(X \le t)$ denote the
cumulative  distribution  function.  We  compute  approximate  cumulative  distribution 
functions  in  order  to  determine  the  probability  distribution  of  a  random  variable. 
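
As a small illustration of approximating a cumulative distribution function from a finite collection of samples, the sketch below evaluates the standard empirical distribution function $F_{\mathcal{N}}(t) = \frac{1}{\mathcal{N}} \#\{n : X^n \le t\}$. It assumes the convention $F(t) = P(X \le t)$ and is not the paper's own algorithm; the function name `empirical_cdf` is hypothetical.

```python
import numpy as np

def empirical_cdf(samples, t):
    """Return F_N(t) = (number of samples <= t) / N for scalar or array t."""
    sorted_samples = np.sort(np.asarray(samples, dtype=float))
    # side="right" counts every sample that is less than or equal to t.
    counts = np.searchsorted(sorted_samples, t, side="right")
    return counts / sorted_samples.size

samples = [0.2, 0.5, 0.5, 0.9]
values = empirical_cdf(samples, np.array([0.0, 0.5, 1.0]))
# values is [0.0, 0.75, 1.0]: no samples <= 0.0, three <= 0.5, all four <= 1.0
```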

1.2.  A  modeling  assumption.  The  first  step  in  developing  a  numerical  method 
for  the  density  estimation  problem  is  to  characterize  the  stochastic  nature  of  the 
random  variations  affecting  the  problem.  We  assume  that  the  stochastic  diffusion 
coefficient  can  be  written 

$\mathcal{A} = a + A$,

where the uniformly coercive, bounded deterministic function $a$ may have multiscale
behavior and $A$ describes a relatively small stochastic perturbation. Specifically, we
assume that $a(x) \ge a_0 > 0$ for $x \in \Omega$ and that $|A(x)| \le \delta a(x)$ for some $0 < \delta < 1$.
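
A short check, spelled out here for convenience and not taken from the paper, shows why these bounds keep every realization of the coefficient uniformly coercive and bounded. Writing $\mathcal{A}$ for the full stochastic coefficient and $A$ for the perturbation, and assuming the bounds are the non-strict inequalities $a \ge a_0$ and $|A| \le \delta a$:

```latex
\[
  0 < (1-\delta)\,a_0 \;\le\; (1-\delta)\,a(x)
    \;\le\; \mathcal{A}(x) = a(x) + A(x)
    \;\le\; (1+\delta)\,a(x),
  \qquad x \in \Omega .
\]
```

The lower bound uses $A(x) \ge -\delta a(x)$ together with $\delta < 1$, and the upper bound uses $A(x) \le \delta a(x)$.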

As a modeling assumption, we assume that $A$ is a piecewise constant function
with random coefficients. Specifically, we let $\mathcal{K}$ be a finite polygonal partition of $\Omega$,
where $\Omega = \cup_{\kappa \in \mathcal{K}} \kappa$ and $\kappa_1$ and $\kappa_2$ either are disjoint or intersect only along a common
boundary when $\kappa_1 \ne \kappa_2$. We let $\chi_\kappa$ denote the characteristic function for the set
$\kappa \in \mathcal{K}$. We assume that

(1.2)   $A(x) = \sum_{\kappa \in \mathcal{K}} A^\kappa \chi_\kappa(x)$,  $x \in \Omega$,

where $(A^\kappa)$ is a random vector and each coefficient $A^\kappa$ is associated with a given
probability distribution. We illustrate such a representation in Figure 1.1. Improving
the model under this assumption requires choosing a finer partition and taking more
measurements $A^\kappa$; see [5].
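
The piecewise constant representation (1.2) can be sketched in code for a setup like the one in Figure 1.1: the unit square partitioned into 9 x 9 cells, a deterministic coefficient equal to 0.1 on a cross and 1 elsewhere, and uniform perturbations within ±10% of that value. Several details below are assumptions made only for illustration: the cross is taken to be the middle row and middle column of cells, the cell values are drawn independently (the paper does not require independence), and the names `cross_coefficient`, `sample_perturbation`, and `evaluate` are hypothetical.

```python
import numpy as np

def cross_coefficient(n_cells=9):
    """Deterministic coefficient a on an n_cells x n_cells partition.

    Assumption: the "cross" is the middle row and middle column of cells.
    """
    a = np.ones((n_cells, n_cells))
    mid = n_cells // 2
    a[mid, :] = 0.1
    a[:, mid] = 0.1
    return a

def sample_perturbation(rng, a, delta=0.1):
    """One sample of the cell values: uniform in [-delta * a, delta * a].

    Assumption: the cell values are drawn independently of each other.
    """
    return rng.uniform(-1.0, 1.0, size=a.shape) * delta * a

def evaluate(cell_values, x, y):
    """Evaluate a piecewise constant function at (x, y) in the unit square."""
    n = cell_values.shape[0]
    i = min(int(y * n), n - 1)  # row index of the cell containing y
    j = min(int(x * n), n - 1)  # column index of the cell containing x
    return cell_values[i, j]

rng = np.random.default_rng(1)
a = cross_coefficient()
A = sample_perturbation(rng, a)
coefficient = a + A  # one sample of the full diffusion coefficient
```

Storing one number per cell is what makes the representation spatially local: refining the model means using more cells and drawing more cell values.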

Note  that  we  do  not  assume  that  the  coefficients  of  A  are  independent  and/or 
uncorrelated.  We  assume  only  that  it  is  possible  to  draw  samples  of  the  values.  We 
denote a finite set of samples by $\{A^{n,\kappa}, n = 1, \dots, \mathcal{N}\}$. There are a few situations in
which  this  is  reasonable;  e.g.,  see  the  following. 

• There may be a component of the diffusion coefficient and/or its error that
can be determined experimentally only at a relatively small set of points in
the domain $\Omega$. For example, consider the SPE10 comparative oil reservoir
simulation project description in [6]. In the absence of more information, it is
natural to build a piecewise constant description of the measured information.

• We may assume or have knowledge of a global distribution governing the random
perturbation and simply represent realizations of the perturbation as
piecewise constant functions with respect to a given domain partition.

Fig. 1.1. Illustration of the modeling assumption (1.2). The unit square is partitioned into
9 x 9 identical squares. The diffusion coefficient $a$ is 0.1 on the "cross"-shaped domain centered at
the origin and 1 elsewhere. The random perturbations are uniform on the interval determined by
±10% of the value of $a$.

We  show  below  that  assuming  a  spatially  localized,  piecewise  constant  description 
for  A  provides  the  possibility  of  devising  a  very  efficient  density  estimation  algorithm. 
In  [3],  we  treat  the  situation  in  which  A  is  a  piecewise  polynomial  function,  and 
in  particular  A  may  be  continuous.  An  alternative,  powerful  approach  to  describe 
random  behavior  is  based  on  the  use  of  Karhunen-Loeve,  polynomial  chaos,  or  other 
orthogonal  expansions  of  the  random  vectors  [7,  1],  which  provides  a  spatially  global 
representation.  However,  this  approach  requires  detailed  knowledge  of  the  probability 
distributions  for  the  input  variables  that  is  often  not  available. 

2.  The  case  of  a  randomly  perturbed  diffusion  coefficient.  We  begin  by 
studying  the  Poisson  equation  with  a  randomly  perturbed  diffusion  coefficient.  We 
let $U \in H^1_0(\Omega)$ (a.s.) solve

(2.1)   $-\nabla \cdot \mathcal{A} \nabla U = f$,  $x \in \Omega$,
        $U = 0$,  $x \in \partial\Omega$,

where $f \in L^2(\Omega)$ is a given deterministic function and $\mathcal{A} = a + A$ satisfies the conditions
described in section 1.2. We construct an efficient numerical method for computing
sample solutions and then provide an a posteriori analysis of the error of the
method.

2.1.  More  notation.  We  use  the  finite  element  method  to  compute  numerical 
solutions. First, some general notation is as follows: Let $\mathcal{T}_h = \{\tau\}$ be a quasi-uniform
partition of $\Omega$ into elements such that $\cup\,\tau = \Omega$. Associated with $\mathcal{T}_h$, we define the discrete finite
element space $V_h$ consisting of continuous, piecewise linear functions on $\mathcal{T}_h$ satisfying
Dirichlet boundary conditions, with mesh size function $h_\tau = \mathrm{diam}(\tau)$ for $x \in \tau$ and
$h = \max_{\tau \in \mathcal{T}_h} h_\tau$. In some situations, we use a more accurate finite element space
$V_{\tilde h}$, either comprising the space of continuous, piecewise quadratic functions or
involving a refinement $\mathcal{T}_{\tilde h}$ of $\mathcal{T}_h$, where $\tilde h \ll h$.

Our approach uses Lions' nonoverlapping domain decomposition method [9, 8].
Again, some general notation is as follows: We let $\{\Omega_d, d = 1,\dots,D\}$ be a decomposition
of $\Omega$ into a finite set of nonoverlapping polygonal subdomains with $\cup\Omega_d = \Omega$.
We denote the boundaries by $\partial\Omega_d$ and outward normals by $n_d$. For a function $A^n$ on
$\Omega$, $A^{n,d}$ means $A^n$ restricted to $\Omega_d$. For $d = 1,\dots,D$, $d'$ denotes the set of indices in
$\{1,2,\dots,D\}\setminus\{d\}$ for which the corresponding domains $\Omega_{\tilde d}$ share a common boundary
with $\Omega_d$. The method is iterative, so for a function $U$ involved in the iteration,
$U_i$ denotes the value at the $i$th iteration. Let $\mathcal{A}^n = a + A^n$ be a particular sample of
the diffusion coefficient with corresponding solution $U^n$. We let $\{U_0^{n,d}, d = 1,\dots,D\}$
denote a set of initial guesses for solutions in the subdomains.

2.2. The computational method. Returning to (2.1), we assume that the
finite element discretization $\mathcal{T}_h$ is obtained by refinement of the partition $\mathcal{K}$ associated with the
modeling assumption (1.2) on $A$. This is natural when the diffusion coefficient $a$ and
the data vary on a scale finer than the partition $\mathcal{K}$.

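Given the initial conditions, for each $i \ge 1$, we numerically solve the $D$ problems

    $-\nabla\cdot\mathcal{A}^n\nabla U_i^{n,d} = f$,  $x \in \Omega_d$,
    $U_i^{n,d} = 0$,  $x \in \partial\Omega_d \cap \partial\Omega$,
    $\frac{1}{\lambda}U_i^{n,d} + n_d\cdot\mathcal{A}^n\nabla U_i^{n,d} = \frac{1}{\lambda}U_{i-1}^{n,\tilde d} - n_{\tilde d}\cdot\mathcal{A}^n\nabla U_{i-1}^{n,\tilde d}$,  $x \in \partial\Omega_d \cap \partial\Omega_{\tilde d}$, $\tilde d \in d'$,

where the parameter $\lambda \in \mathbb{R}$ is chosen to minimize the number of iterations. In practice,
we compute $I$ iterations. Note that the problems can be solved independently.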

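To discretize, we let $V_{h,d} \subset H^1_{0,\partial\Omega}(\Omega_d)$ be a finite element approximation space
corresponding to the mesh $\mathcal{T}_d$ on $\Omega_d$, where

    $H^1_{0,\partial\Omega}(\Omega_d) = \{v \in H^1(\Omega_d) : v|_{\partial\Omega_d\cap\partial\Omega} = 0\}$.

We let $(\cdot,\cdot)_d$ denote the $L^2(\Omega_d)$ scalar product, $\langle\cdot,\cdot\rangle_d$ denote the $L^2(\partial\Omega_d)$ scalar
product, and $\langle\cdot,\cdot\rangle_{d\cap\tilde d}$ denote the $L^2(\partial\Omega_d \cap \partial\Omega_{\tilde d})$ scalar product for $\tilde d \in d'$. The first
two inner products are associated with norms $\|\cdot\|_d$ and $|\cdot|_d$, respectively. For each
$i \ge 1$, we compute $U_i^{n,d} \in V_{h,d}$, $d = 1,\dots,D$, solving

(2.2)    $(\mathcal{A}^n\nabla U_i^{n,d},\nabla v)_d + \frac{1}{\lambda}\langle U_i^{n,d},v\rangle_d$
             $= (f,v)_d + \sum_{\tilde d\in d'}\left(\frac{1}{\lambda}\langle U_{i-1}^{n,\tilde d},v\rangle_{d\cap\tilde d} - \langle n_{\tilde d}\cdot\mathcal{A}^n\nabla U_{i-1}^{n,\tilde d},v\rangle_{d\cap\tilde d}\right)$  for all $v \in V_{h,d}$.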

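It is convenient to use the matrix form of (2.2). We let $\{\varphi_m^d, m = 1,\dots,n_d\}$ be
the finite element basis functions for the space $V_{h,d}$, $d = 1,\dots,D$. We let $\vec U_i^{n,d}$ denote
the vector of basis coefficients of $U_i^{n,d}$ with respect to $\{\varphi_m^d\}$. On domain $\Omega_d$,

    $(k^{a,d} + k^{n,d})\,\vec U_i^{n,d} = b^d(f) + b^{n,d}(\mathcal{A}^n, U_{i-1}^{n,d'})$,

where

    $(k^{a,d})_{lk} = (a\nabla\varphi_l^d,\nabla\varphi_k^d)_d + \frac{1}{\lambda}\langle\varphi_l^d,\varphi_k^d\rangle_d$,
    $(k^{n,d})_{lk} = (A^{n,d}\nabla\varphi_l^d,\nabla\varphi_k^d)_d$,
    $(b^d)_k = (f,\varphi_k^d)_d$,
    $(b^{n,d})_k = \sum_{\tilde d\in d'}\left(\frac{1}{\lambda}\langle U_{i-1}^{n,\tilde d},\varphi_k^d\rangle_{d\cap\tilde d} - \langle n_{\tilde d}\cdot\mathcal{A}^n\nabla U_{i-1}^{n,\tilde d},\varphi_k^d\rangle_{d\cap\tilde d}\right)$

for $1 \le l, k \le n_d$. We abuse notation mildly by denoting the dependence of the
data $b^{n,d}(\mathcal{A}^n, U_{i-1}^{n,d'})$ on the values of $U_{i-1}^{n,\tilde d}$ for $\tilde d \in d'$. We summarize this approach in
Algorithm 1.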


Algorithm 1. Monte Carlo domain decomposition finite element method

for n = 1, ..., N (number of samples) do
    for i = 1, ..., I (number of iterations) do
        for d = 1, ..., D (number of domains) do
            Solve $(k^{a,d} + k^{n,d})\,\vec U_i^{n,d} = b^d(f) + b^{n,d}(\mathcal{A}^n, U_{i-1}^{n,d'})$
        end for
    end for
end for

Unfortunately, this algorithm is expensive for a large number of realizations since
each solution $U^n$ requires the solution of a discrete set of equations. To construct a
more efficient method, we impose a restriction on the domains in the decomposition.
We assume that each domain $\Omega_d$ is contained in a domain $\kappa$ in the partition $\mathcal{K}$ used
in the modeling assumption (1.2). This implies that the random perturbation $A^{n,d}$ is
constant on each $\Omega_d$; i.e., it is a random number. Consequently, the matrix $k^{n,d}$ has
coefficients

    $(k^{n,d})_{lk} = (A^{n,d}\nabla\varphi_l^d,\nabla\varphi_k^d)_d = A^{n,d}(\nabla\varphi_l^d,\nabla\varphi_k^d)_d = A^{n,d}(k^d)_{lk}$,

where $k^d$ is the standard stiffness matrix with coefficients $(k^d)_{lk} = (\nabla\varphi_l^d,\nabla\varphi_k^d)_d$. We
now use the fact that $A^{n,d}$ is relatively small to motivate the introduction of the
Neumann series. Formally, the Neumann series for the inverse of a perturbation of
the identity matrix provides

    $(k^{a,d} + A^{n,d}k^d)^{-1} = \left(k^{a,d}(\mathrm{id} + A^{n,d}(k^{a,d})^{-1}k^d)\right)^{-1}$
        $= (\mathrm{id} + A^{n,d}(k^{a,d})^{-1}k^d)^{-1}(k^{a,d})^{-1}$
        $= \sum_{p=0}^{\infty}(-A^{n,d})^p\left((k^{a,d})^{-1}k^d\right)^p(k^{a,d})^{-1}$,

where id is the identity matrix. We compute only $P$ terms in the Neumann expansion
to generate the approximation

(2.3)    $\vec U_{P,i}^{n,d} = \sum_{p=0}^{P-1}(-A^{n,d})^p\left((k^{a,d})^{-1}k^d\right)^p(k^{a,d})^{-1}\left(b^d(f) + b^{n,d}(\mathcal{A}^n, U_{P,i-1}^{n,d'})\right)$.

We discuss the convergence as $P \to \infty$ in detail below.

Note that $b^{n,d}$ is nonzero only at boundary nodes. If $W_{h,d}$ denotes the set of
vectors determined by the finite element basis functions associated with the boundary
nodes on $\Omega_d$, then $b^{n,d}$ is in the span of $W_{h,d}$. We can precompute
$\left((k^{a,d})^{-1}k^d\right)^p(k^{a,d})^{-1}W_{h,d}$
efficiently, e.g., using Gaussian elimination. This computation is independent of $n$.
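The sample-independent precomputation can be illustrated on a toy system. The sketch below is only an illustration under stated assumptions: the 3 x 3 matrices, the diagonal term standing in for the Robin contribution, and all function names are hypothetical. It precomputes the vectors $((k^a)^{-1}k)^p(k^a)^{-1}b$ once and then forms the truncated series for a scalar sample value $A$ without further linear solves.

```python
def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[r]) + [rhs[r]] for r in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            factor = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= factor * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def precompute_terms(k_a, k_d, b, P):
    """z[p] = ((k_a^{-1} k_d)^p) k_a^{-1} b for p = 0..P-1; uses no sample data."""
    z = [solve(k_a, b)]
    for _ in range(1, P):
        z.append(solve(k_a, matvec(k_d, z[-1])))
    return z

def neumann_solution(z, A):
    """Truncated series sum_p (-A)^p z[p] for one scalar sample value A."""
    n = len(z[0])
    return [sum((-A) ** p * z[p][i] for p in range(len(z))) for i in range(n)]

# Toy data (assumed): k_d is a 1D stiffness matrix, k_a = k_d + 0.5 * identity.
k_d = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
k_a = [[k_d[i][j] + (0.5 if i == j else 0.0) for j in range(3)] for i in range(3)]
b = [1.0, 1.0, 1.0]
z = precompute_terms(k_a, k_d, b, P=8)

A = 0.1  # one sample of the perturbation
u_series = neumann_solution(z, A)
u_exact = solve([[k_a[i][j] + A * k_d[i][j] for j in range(3)] for i in range(3)], b)
```

For a new sample only `neumann_solution` is called, so the cost per sample is a short weighted sum rather than a linear solve.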


Algorithm 2. Monte Carlo domain decomposition finite element method
using a truncated Neumann series

for d = 1, ..., D (number of domains) do
    for p = 1, ..., P (number of terms) do
        Compute $y = \left((k^{a,d})^{-1}k^d\right)^p(k^{a,d})^{-1}b^d(f)$
        Compute $y^p = \left((k^{a,d})^{-1}k^d\right)^p(k^{a,d})^{-1}W_{h,d}$
    end for
end for

for i = 1, ..., I (number of iterations) do
    for d = 1, ..., D (number of domains) do
        for p = 1, ..., P (number of terms) do
            for n = 1, ..., N (number of samples) do
                Compute $\vec U_{P,i}^{n,d} = \sum_{p=0}^{P-1}(-A^{n,d})^p\left(y^p\,b^{n,d}(\mathcal{A}^n, U_{P,i-1}^{n,d'}) + y\right)$
            end for
        end for
    end for
end for


Combining this with Algorithm 1, we obtain the computational method given in
Algorithm 2. We let $U_{P,i}^{n,d}$ denote the finite element functions determined by $\vec U_{P,i}^{n,d}$ for
$n = 1,\dots,N$ and $d = 1,\dots,D$. We let $U_{P,i}^n$ denote the finite element function which
is equal to $U_{P,i}^{n,d}$ on $\Omega_d$.

Remark 2.1. Note that the number of linear systems that have to be solved in
Algorithm 2 is independent of $N$. Hence, there is potential for enormous savings when
the number of samples is large.

2.3. Convergence of the Neumann series approximation. It is crucial
for the method that the Neumann series converges. The following theorem shows
convergence under the reasonable assumption that the random perturbations to
the diffusion coefficient are smaller than the coefficient. We let $|||v|||_d^2 = \|\nabla v\|_d^2 + \epsilon|v|_d^2$
for some $\epsilon > 0$. We define the matrices $c^{n,d} = -A^{n,d}(k^{a,d})^{-1}k^d$ and denote the
corresponding operators on the finite element spaces by $c^{n,d} : V_{h,d} \to V_{h,d}$.

Theorem 2.1. If $\eta = |\max\{A^{n,d}\}/a_0| < 1$, then

(2.4)    (a)  $(\mathrm{id} - c^{n,d})^{-1} = \sum_{p=0}^{\infty}(c^{n,d})^p$,

(2.5)    (b)  $\left|\left|\left|(\mathrm{id} - c^{n,d})^{-1}v - \sum_{p=0}^{P-1}(c^{n,d})^p v\right|\right|\right|_d \le \frac{\eta^P}{1-\eta^P}\left|\left|\left|\sum_{p=0}^{P-1}(c^{n,d})^p v\right|\right|\right|_d$

for any $v \in V_{h,d}$.




D.  ESTEP,  A.  MALQVIST,  AND  S.  TAVENER 


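Proof. Let $z = c^{n,d}w$ for an arbitrary $w \in V_{h,d}$. From the definition of $c^{n,d}$,
$z \in V_{h,d}$ satisfies

(2.6)    $(a\nabla z,\nabla v)_d + \frac{1}{\lambda}\langle z,v\rangle_d = -(A^{n,d}\nabla w,\nabla v)_d$

for all $v \in V_{h,d}$. Choosing $v = z$ in (2.6) and using the Cauchy-Schwarz inequality
yields

    $a_0\|\nabla z\|_d^2 + \frac{1}{\lambda}|z|_d^2 \le |A^{n,d}|\,\|\nabla w\|_d\,\|\nabla z\|_d$.

Choosing $\epsilon < 2/(\lambda a_0)$ in the definition of the norm $|||v|||_d^2 = \|\nabla v\|_d^2 + \epsilon|v|_d^2$ and making
standard estimates gives

    $|||z|||_d^2 \le \eta^2\|\nabla w\|_d^2 \le \eta^2|||w|||_d^2$.

By induction,

(2.7)    $|||(c^{n,d})^p w|||_d \le \eta^p|||w|||_d$.

In particular, $(c^{n,d})^p \to 0$ as $p \to \infty$.

We take the limit as $P$ tends to infinity in the identity

    $\mathrm{id} - (c^{n,d})^P = (\mathrm{id} - c^{n,d})\sum_{p=0}^{P-1}(c^{n,d})^p$

to obtain (2.4).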

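In order to prove (b) we note that

    $(\mathrm{id} - c^{n,d})^{-1} - \sum_{p=0}^{P-1}(c^{n,d})^p = \sum_{p=P}^{\infty}(c^{n,d})^p = (c^{n,d})^P(\mathrm{id} - c^{n,d})^{-1}$.

In the finite element context, for $v \in V_{h,d}$,

    $\left|\left|\left|(\mathrm{id} - c^{n,d})^{-1}v - \sum_{p=0}^{P-1}(c^{n,d})^p v\right|\right|\right|_d = |||(c^{n,d})^P(\mathrm{id} - c^{n,d})^{-1}v|||_d \le \eta^P|||(\mathrm{id} - c^{n,d})^{-1}v|||_d$
        $\le \eta^P\left|\left|\left|\sum_{p=0}^{P-1}(c^{n,d})^p v\right|\right|\right|_d + \eta^P\left|\left|\left|(\mathrm{id} - c^{n,d})^{-1}v - \sum_{p=0}^{P-1}(c^{n,d})^p v\right|\right|\right|_d$.

The theorem follows immediately. □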
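A quick numerical sanity check of the truncation estimate is possible in the scalar case, where the operator reduces to a number $c$ with $|c| \le \eta < 1$. The sketch below assumes the estimate has the form "truncation error is at most $\eta^P/(1-\eta^P)$ times the size of the partial sum"; the function name and the tested values are hypothetical.

```python
def truncation_error_and_bound(c, eta, P):
    """Scalar analogue: compare |1/(1-c) - sum_{p<P} c^p| with the a posteriori bound."""
    partial = sum(c ** p for p in range(P))
    exact = 1.0 / (1.0 - c)
    error = abs(exact - partial)
    bound = eta ** P / (1.0 - eta ** P) * abs(partial)
    return error, bound

results = []
for c in (-0.5, -0.1, 0.3):
    for P in (1, 2, 5):
        results.append((c, P) + truncation_error_and_bound(c, abs(c), P))
```

The bound is computable from the partial sum alone, which is what makes it useful in practice: it does not require the exact inverse.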


2.4.  A  numerical  example.  We  present  a  numerical  example  that  illustrates 
the  convergence  properties  of  the  proposed  method.  Below,  we  derive  an  a  posteri¬ 
ori  estimate  for  the  contributions  to  the  error  of  the  approximation  and  develop  an 
adaptive  algorithm  that  provides  the  means  to  balance  these  contributions  efficiently. 

In this example, we partition the unit square into 9 × 9 equal square subdomains
for the domain decomposition algorithm. The coefficient is $\mathcal{A}^n = a + A^n$, where $a$ and
$A^n$ are piecewise constant on the 9 × 9 subdomains. The diffusion coefficient $a$ is equal
to 1 except on a cross in the center of the domain $\Omega$, where it is equal to 0.1. The
random perturbations are uniform on the interval determined by ±10% of the value
of $a$. We illustrate a typical sample in Figure 1.1. The data is $f \equiv 1$. We estimate
the error in the quantity of interest, which is the average of the solution.

Fig. 2.1. Left: convergence in mesh size as the numbers of domain decomposition iterations
and terms in the Neumann series are held constant. Middle: convergence with respect to the number
of domain decomposition iterations as the mesh size and number of terms in the Neumann series
are held constant. Right: convergence with respect to the number of terms in the Neumann series
as the mesh size and number of domain decomposition iterations are held constant.


2.4.1. Convergence for a single realization. First, we consider a single sample
$\mathcal{A}^n$ in order to focus on the convergence with respect to mesh size, number of
iterations, and number of terms. In particular, we study how the accuracy of a computed
linear functional of the solution $(U_{P,I}^n,\psi)$ depends on the three parameters $h$,
$P$, and $I$. To compute approximate errors, we use a reference solution with $h = 1/72$,
$I = 300$, and $P = 5$.

We start by letting $I = 300$ and $P = 5$, and let the number of elements in each
direction ($1/h$) vary from 18 to 72; i.e., the total number of nodes in the mesh varies
from $(18 + 1)^2 = 361$ to $(72 + 1)^2 = 5329$. In Figure 2.1, we plot the relative error
as the mesh size decreases. Next, we fix $1/h = 72$, keep $P = 5$, and vary $I$ from 10
to 300. In Figure 2.1, we plot the relative error as the number of iterations increases.
Finally, we let $1/h = 72$, $I = 300$, and let $P$ take the values 1, 3, and 5. Here, the reference
solution is computed using $P = 7$. We plot the results in Figure 2.1.

2.4.2. Convergence with respect to number of samples. Next, we fix the
spatial discretization and experimentally investigate the accuracy in the cumulative
distribution function as a function of the number of samples. Following the problem
in section 2.4.1, we fix $h = 1/72$, $I = 300$, and $P = 5$, vary $N$ from 30 to 480, and
compute the cumulative distribution function. We present the result in Figure 2.2.
We observe that the distribution function becomes smoother as the number
of samples increases and appears to converge.

To approximate the error as the number of samples increases, we compute a reference solution
using $N = 480$. We plot the errors in Figure 2.3. The error decreases significantly
between $N = 30$ and $N = 240$, but the convergence is fairly slow.


2.5. A posteriori error analysis of sample values. We next derive an a
posteriori error estimate for each sample linear functional value $(U^n,\psi)$ [4, 2]. We
introduce a corresponding adjoint problem

(2.8)    $-\nabla\cdot\mathcal{A}\nabla\Phi = \psi$,  $x \in \Omega$,
         $\Phi = 0$,  $x \in \partial\Omega$.

We compute $N$ sample solutions $\{\Phi^n, n = 1,\dots,N\}$ of (2.8) corresponding to the
samples $\{\mathcal{A}^n, n = 1,\dots,N\}$.

Fig. 2.2. Convergence in the number of samples as the mesh, the number of domain decomposition
iterations, and the number of terms in the Neumann series are held constant. Plots from left
to right, top to bottom: N = 30, 60, 120, 480.

Fig. 2.3. Error in the distribution function compared to a reference solution as the mesh, the
number of iterations, and the number of terms are held constant. Plots from left to right, top to
bottom: N = 30, 60, 120, 240.


To obtain computable estimates, we compute numerical solutions of (2.8) using
Algorithm 2. We obtain numerical approximations $\{\tilde\Phi^n_{\tilde P,\tilde I}, n = 1,\dots,N\}$ using a more
accurate finite element discretization computed using either the space of continuous,
piecewise quadratic functions or a refinement $\mathcal{T}_{\tilde h}$ of $\mathcal{T}_h$, where $\tilde h \ll h$. We denote
the approximation on $\Omega_d$ by $\tilde\Phi^{n,d}_{\tilde P,\tilde I}$.

Theorem 2.2. For each $n \in 1,\dots,N$,

(2.9)    $|(U^n - U^n_{P,I},\psi)| \lesssim |(f,\tilde\Phi^n_{\tilde P,\tilde I}) - (\mathcal{A}^n\nabla U^n_{P,I},\nabla\tilde\Phi^n_{\tilde P,\tilde I})| + R(h,P,I)$,

where 


n{h 


V? 


Tf^  is  a  refinement  of  T^. 


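Proof. With $\Phi^n$ solving (2.8), the standard Green's function argument yields the
representation

    $(U^n - U^n_{P,I},\psi) = (f,\Phi^n) - (\mathcal{A}^n\nabla U^n_{P,I},\nabla\Phi^n)$.

We write this as

    $(U^n - U^n_{P,I},\psi) = (f,\tilde\Phi^n_{\tilde P,\tilde I}) - (\mathcal{A}^n\nabla U^n_{P,I},\nabla\tilde\Phi^n_{\tilde P,\tilde I})$
        $+ \left((f,\Phi^n - \tilde\Phi^n_{\tilde P,\tilde I}) - (\mathcal{A}^n\nabla U^n_{P,I},\nabla(\Phi^n - \tilde\Phi^n_{\tilde P,\tilde I}))\right)$

and define

(2.10)    $R(h,P,I) = (f,\Phi^n - \tilde\Phi^n_{\tilde P,\tilde I}) - (\mathcal{A}^n\nabla U^n_{P,I},\nabla(\Phi^n - \tilde\Phi^n_{\tilde P,\tilde I}))$.

We introduce auxiliary adjoint problems for the purpose of analysis. Let $\Upsilon^n \in V$
solve

(2.11)    $(\mathcal{A}^n\nabla\Upsilon^n,\nabla v) = (f,v) - (\mathcal{A}^n\nabla U^n_{P,I},\nabla v)$  for all $v \in V$,

corresponding to the quantity of interest $(f,\Phi^n) - (\mathcal{A}^n\nabla U^n_{P,I},\nabla\Phi^n)$. The standard
Green's function argument yields

    $(f,\Phi^n - \tilde\Phi^n_{\tilde P,\tilde I}) - (\mathcal{A}^n\nabla U^n_{P,I},\nabla(\Phi^n - \tilde\Phi^n_{\tilde P,\tilde I})) = (\Upsilon^n,\psi) - (\mathcal{A}^n\nabla\Upsilon^n,\nabla\tilde\Phi^n_{\tilde P,\tilde I})$.

Using $\tilde\Phi^n_{\infty,\tilde I}$ to denote the approximate solution obtained by using the complete Neumann
series (which is equivalent to finding the solution of the problem with the full
diffusion coefficient), we have

(2.12)    $(f,\Phi^n - \tilde\Phi^n_{\tilde P,\tilde I}) - (\mathcal{A}^n\nabla U^n_{P,I},\nabla(\Phi^n - \tilde\Phi^n_{\tilde P,\tilde I})) = (\Upsilon^n,\psi) - (\mathcal{A}^n\nabla\Upsilon^n,\nabla\tilde\Phi^n_{\infty,\tilde I})$
              $+ (\mathcal{A}^n\nabla\Upsilon^n,\nabla(\tilde\Phi^n_{\infty,\tilde I} - \tilde\Phi^n_{\tilde P,\tilde I}))$.

By Theorem 2.1, the third term on the right-hand side can be made arbitrarily small
by taking $\tilde P$ large. We can use Galerkin orthogonality on the first two terms on the
right-hand side of (2.12) by introducing a projection $\pi_{\tilde h}$ into $V_{\tilde h}$. Decomposing into a
sum of integrals over elements and integrating by parts on each element, we have

    $(\Upsilon^n,\psi) - (\mathcal{A}^n\nabla\Upsilon^n,\nabla\tilde\Phi^n_{\infty,\tilde I}) = (\Upsilon^n - \pi_{\tilde h}\Upsilon^n,\psi) - (\mathcal{A}^n\nabla(\Upsilon^n - \pi_{\tilde h}\Upsilon^n),\nabla\tilde\Phi^n_{\infty,\tilde I})$
        $= \sum_{\tau}\left((\Upsilon^n - \pi_{\tilde h}\Upsilon^n,\psi + \nabla\cdot\mathcal{A}^n\nabla\tilde\Phi^n_{\infty,\tilde I})_\tau - \langle\Upsilon^n - \pi_{\tilde h}\Upsilon^n,\mathcal{A}^n\partial_n\tilde\Phi^n_{\infty,\tilde I}\rangle_{\partial\tau}\right)$,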

where $\partial_n$ denotes the normal derivative to $\partial\tau$. The standard argument involving
interpolation estimates now yields the bounds in (2.10). □

2.5.1. Numerical example. We present a brief example illustrating the accuracy
of the a posteriori estimate (2.9). We consider just one sample diffusion value
$\mathcal{A}^1 = a + A^1$, with $a = 0.9$ and $A^1 = 0.1$ on the unit square. We compute the error
in the average value by choosing $\psi \equiv 1$. We set $f = 2\cdot x(1-x) + 2\cdot y(1-y)$ so that
the exact solution is $U^1 = x(1-x)\cdot y(1-y)$ and $(U^1,\psi) = 1/36$.
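The stated exact solution can be checked directly. Since $a + A^1 = 1$, the equation reduces to $-\Delta U = f$. The sketch below verifies this with central differences and approximates the average of the solution with the midpoint rule; the helper names, step size, and grid size are illustrative choices.

```python
def U(x, y):
    """Candidate exact solution x(1-x) y(1-y) on the unit square."""
    return x * (1.0 - x) * y * (1.0 - y)

def f(x, y):
    """Right-hand side 2 x(1-x) + 2 y(1-y)."""
    return 2.0 * x * (1.0 - x) + 2.0 * y * (1.0 - y)

def minus_laplacian(u, x, y, h=1e-3):
    """Second-order central difference approximation of -(u_xx + u_yy)."""
    stencil = u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h) - 4.0 * u(x, y)
    return -stencil / h**2

def midpoint_average(u, m=200):
    """Midpoint-rule integral of u over the unit square (area 1, so also the average)."""
    h = 1.0 / m
    total = sum(u((i + 0.5) * h, (j + 0.5) * h) for i in range(m) for j in range(m))
    return total * h * h

residual = abs(minus_laplacian(U, 0.3, 0.7) - f(0.3, 0.7))
average = midpoint_average(U)
```

The residual is at roundoff level because the difference stencil is exact for functions that are quadratic in each variable, and the computed average agrees with 1/36 up to the quadrature error.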

We divide the computational domain into 8 × 8 equally sized squares on which
we compute the numerical approximation to $U^1$ using Lions' domain decomposition
algorithm with $I$ iterations using the approximate local solver involving a truncated
Neumann series of $P$ terms. We let $h = 1/32$ so that each subdomain has a discretization of
5 × 5 nodes. To solve the adjoint problem, we use a refined mesh with $\tilde h = h/2$ and
use $\gamma P$ terms in the truncated Neumann series and $\gamma I$ iterations in the domain
decomposition algorithm, where $\gamma > 0$. To evaluate the accuracy of the estimate, we
use the efficiency index (the ratio reported in the tables below).

We start by letting $\gamma = 2$; i.e., we put a lot of effort into solving the adjoint
problem. We present results for varying $I$ and $P$ in Table 2.1.

Table 2.1
Efficiency index results for γ = 2.

    P    I    Ratio   |   P    I    Ratio
    1   50    0.992   |   3   10    2.78
    2   50    1.03    |   3   25    1.02
    3   50    1.01    |   3   50    1.01

Next, we let $\gamma = 0.5$; i.e., we use much poorer resolution for the adjoint solution.
We present the results in Table 2.2.

Table 2.2
Efficiency index results for γ = 0.5.

    P    I    Ratio   |   P    I    Ratio
    1   50    0.908   |   3   10    8.21
    2   50    1.00    |   3   25    1.08
    3   50    0.933   |   3   50    0.933

The efficiency indexes are close to one except when the number of domain
decomposition iterations for the adjoint problem is very low. In general, it appears
that as long as the number of domain decomposition iterations is sufficiently large,
the adjoint problem can be solved with rather poor resolution, yet we still obtain a
reasonably accurate error estimate.

3. The case of random perturbation in data. For the sake of completeness,
we treat the case in which the data $G$ in (1.1) is randomly perturbed. It is
straightforward to combine the cases of randomly perturbed diffusion coefficient and data.
We present a fast method for computing samples of a linear functional of the solution
given samples of the right-hand side data. It is straightforward to deal with a more
general elliptic operator, so we let $U \in H_0^1(\Omega)$ (a.s.) solve

(3.1)    $a(U,v) = (G(x),v)$,  $v \in H_0^1(\Omega)$,

where

    $a(w,v) = (a\nabla w,\nabla v) + (b\cdot\nabla w,v) + (cw,v)$

for $w,v \in H_0^1(\Omega)$, $G(x) \in L^2(\Omega)$ (a.s.), $G(\cdot)$ has a continuous and bounded covariance
function, and $a$, $b$, and $c$ are deterministic functions chosen such that (3.1) has a
unique weak solution in $H_0^1(\Omega)$. In particular, $a(x) \ge a_0 > 0$ for all $x$. We let
$\{G^n(x), n = 1,\dots,N\}$ denote a finite collection of samples.

3.1. Computational method. In the case of randomly perturbed right-hand
side and data, we can use the method of Green's functions to construct an efficient
method for density estimation. We introduce a deterministic adjoint problem. We
let the quantity of interest be a linear functional $Q(v) = (v,\psi)$ determined by a function
$\psi \in L^2(\Omega)$ and construct the corresponding adjoint problem for the generalized
Green's function $\phi \in H_0^1(\Omega)$,

(3.2)    $a^*(\phi,v) = (\psi,v)$,  $v \in H_0^1(\Omega)$,

where

    $a^*(w,v) = (a\nabla w,\nabla v) - (\nabla\cdot(bw),v) + (cw,v)$.

It immediately follows that

    $(U^n,\psi) = a^*(\phi,U^n) = a(U^n,\phi) = (G^n,\phi)$

for $n = 1,\dots,N$. By linearity, we see that $E(U) \in H_0^1(\Omega)$ solves

    $a(E(U),v) = (E(G),v)$,  $v \in H_0^1(\Omega)$,

and we can obtain an analogous representation. We conclude that the classic Green's
representation holds.

Theorem 3.1. For samples $\{G^n, n = 1,\dots,N\}$, we have

(3.3)    $(U^n,\psi) = (G^n,\phi)$

for $n = 1,\dots,N$. We also have

(3.4)    $E((U,\psi)) = (E(G),\phi)$.

The point is that, theoretically, instead of solving a partial differential equation
for each sample in order to build the distribution of $(U^n,\psi)$, we can solve one
deterministic problem to get $\phi$ and then calculate values of $(U^n,\psi)$ using a relatively
inexpensive inner product. Indeed, we never approximate $U^n$ in order to estimate the
samples of the quantity of interest in this approach.
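A discrete analogue makes the saving concrete. The sketch below is an illustration under stated assumptions: a 1D finite difference model of $-u'' = g$ with homogeneous Dirichlet conditions (a symmetric operator, so the adjoint system has the same matrix), hypothetical Gaussian samples of the data, and illustrative function names. It compares one adjoint solve plus inner products against one solve per sample.

```python
import random

def thomas_solve_poisson(rhs, h):
    """Solve the tridiagonal system for -u'' = rhs with u(0) = u(1) = 0."""
    n = len(rhs)
    diag, off = 2.0 / h**2, -1.0 / h**2
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag
    d[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u

def inner(v, w, h):
    """Discrete L2 inner product on the interior grid."""
    return h * sum(a * b for a, b in zip(v, w))

n = 49
h = 1.0 / (n + 1)
psi = [1.0] * n                       # quantity of interest: the average
phi = thomas_solve_poisson(psi, h)    # one deterministic adjoint solve

rng = random.Random(1)
samples = [[1.0 + rng.gauss(0.0, 0.2) for _ in range(n)] for _ in range(20)]
q_adjoint = [inner(phi, g, h) for g in samples]                          # inner products only
q_direct = [inner(thomas_solve_poisson(g, h), psi, h) for g in samples]  # one solve per sample
```

Both lists agree to roundoff, but `q_adjoint` needs a single linear solve regardless of the number of samples.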

In order to make this approach practical, we introduce a finite element
approximation $\phi_h \in V_h$ satisfying

(3.5)    $a^*(\phi_h,v) = (\psi,v)$,  $v \in V_h$.

We obtain the computable approximations

(3.6)    $(U^n,\psi) \approx (\phi_h,G^n)$,

(3.7)    $E((U,\psi)) \approx (E(G),\phi_h)$.

3.2. A posteriori error estimate for samples. We next present an a posteriori
error analysis for the approximate value for each sample $(U^n,\psi)$ and for $E((U,\psi))$.
For samples, the error is

    $(U^n,\psi) - (G^n,\phi_h) = (G^n,\phi) - (G^n,\phi_h) = (G^n,\phi - \phi_h)$.

For each sample, this is a quantity of interest for $\phi$ corresponding to $G^n$. To avoid
confusion, we let $\Theta^n \in H_0^1(\Omega)$ denote the forward adjoint solution solving

(3.8)    $a(\Theta^n,v) = (G^n,v)$,  $v \in H_0^1(\Omega)$.

Note that because we treat a linear problem, $\Theta^n = U^n$. Similarly, we let $E(\Theta) \in
H_0^1(\Omega)$ solve

(3.9)    $a(E(\Theta),v) = (E(G),v)$,  $v \in H_0^1(\Omega)$.

The standard analysis gives

    $(G^n,\phi_h - \phi) = a(\Theta^n,\phi_h - \phi) = a^*(\phi_h - \phi,\Theta^n)$
        $= a^*(\phi_h,\Theta^n) - a^*(\phi,\Theta^n) = a^*(\phi_h,\Theta^n) - (\psi,\Theta^n)$
        $= a^*(\phi_h,(I - \pi_h)\Theta^n) - (\psi,(I - \pi_h)\Theta^n)$,

where the last step follows from Galerkin orthogonality. We can argue similarly for
$E((G^n,\phi_h - \phi))$.

To use this representation, we need to solve the forward adjoint problem (3.8) in
$V_{\tilde h}$. Unfortunately, in the case of the error of the samples, this requires computing
$N$ approximate forward adjoint solutions $\Theta^n$ using a more expensive finite element
computation. Another approach is simply to approximate $(G^n,\phi)$ by $(G^n,\phi_{\tilde h})$, where
$\phi_{\tilde h}$ is a finite element approximation of $\phi$ in $V_{\tilde h}$. An analysis of the accuracy of this
replacement follows easily from the relation

    $(G^n,\phi_h - \phi) = (G^n,\phi_h - \phi_{\tilde h}) + (G^n,\phi_{\tilde h} - \phi)$.

In  either  case,  arguing  as  for  Theorem  2.2  yields  the  following. 

Theorem 3.2. The solution error in each sample is estimated by

(3.10)    $(G^n,\phi_h - \phi) \approx a^*(\phi_h,(I - \pi_h)\tilde\Theta^n) - (\psi,(I - \pi_h)\tilde\Theta^n)$,

where $\tilde\Theta^n$ is a finite element approximation for the adjoint problem (3.8) in $V_{\tilde h}$ and
$\pi_h$ denotes a projection into the finite element space $V_h$. We also have

(3.11)    $(G^n,\phi_h - \phi) \approx (G^n,\phi_h - \phi_{\tilde h})$,

where $\phi_{\tilde h}$ is a finite element solution of the adjoint problem (3.2) computed in $V_{\tilde h}$.

We also have the estimates

(3.12)    $E((G^n,\phi_h - \phi)) \approx a^*(\phi_h,(I - \pi_h)E(\tilde\Theta)) - (\psi,(I - \pi_h)E(\tilde\Theta))$,

where $E(\tilde\Theta)$ is a finite element approximation for the adjoint problem (3.9) computed
on a finer mesh $\mathcal{T}_{\tilde h}$ or using the space of piecewise quadratic functions. We also have

(3.13)    $E((G,\phi_h - \phi)) \approx (E(G),\phi_h - \phi_{\tilde h})$.

The  error  of  these  estimates  is  bounded  by  Gh^  or  Gh^.  In  both  cases,  these 
bounds are asymptotically smaller than the estimates themselves.

4. A posteriori error analysis for an approximate distribution. We now
present an a posteriori error analysis for the approximate cumulative distribution
function obtained from $N$ approximate sample values of a linear functional of a solution
of a partial differential equation with random perturbations.

We let $U = U(X)$ be a solution of an elliptic problem that is randomly perturbed
by a random variable $X$ on a probability space and $Q(U) = (U,\psi)$ be a
quantity of interest for some $\psi \in L^2(\Omega)$. We want to approximate the probability
distribution function of $Q = Q(X)$,

    $F(t) = P(\{X : Q(U(X)) \le t\}) = P(Q \le t)$.

We use the sample distribution function computed from a finite collection of approximate
sample values $\{\tilde Q^n, n = 1,\dots,N\} = \{(\tilde U^n,\psi), n = 1,\dots,N\}$:

    $\tilde F_N(t) = \frac{1}{N}\sum_{n=1}^{N} I(\tilde Q^n \le t)$,

where $I$ is the indicator function. Here, $\tilde U^n$ is a numerical approximation for a true
solution $U^n$ corresponding to a sample $X^n$. We assume that there is an error estimate

    $\tilde Q^n - Q^n \approx \mathcal{E}^n$

with $Q^n = (U^n,\psi)$. We use Theorem 2.2 or 3.2, for example.

There  are  two  sources  of  error: 

1.  finite  sampling, 

2.  numerical  approximation  of  the  differential  equation  solutions. 

We define the sample distribution function

    $F_N(t) = \frac{1}{N}\sum_{n=1}^{N} I(Q^n \le t)$

and decompose the error

(4.1)    $|F(t) - \tilde F_N(t)| \le |F(t) - F_N(t)| + |F_N(t) - \tilde F_N(t)| = \mathrm{I} + \mathrm{II}$.

There is extensive statistics literature treating I; e.g., see [10]. We note that $F_N$
has very desirable properties; e.g., see the following.

• As a function of $t$, $F_N(t)$ is a distribution function, and for each fixed $t$, $F_N(t)$
is a random variable corresponding to the sample.

• It is an unbiased estimator, i.e., $E(F_N) = E(F)$.

• $N F_N(t)$ has exact binomial distribution for $N$ trials and probability of success
$F(t)$.

• $\mathrm{Var}(F_N(t)) = F(t)(1 - F(t))/N \to 0$ as $N \to \infty$, and $F_N$ converges in mean
square to $F$ as $N \to \infty$.
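As a small illustration of these definitions, the sample distribution function and the binomial variance formula can be written as follows. The function names and the toy sample values are assumptions for the example only.

```python
def ecdf(samples, t):
    """Sample distribution function: fraction of samples with value <= t."""
    return sum(1 for q in samples if q <= t) / len(samples)

def binomial_variance(F_t, N):
    """Variance F(t)(1 - F(t))/N of the sample distribution function at t."""
    return F_t * (1.0 - F_t) / N

toy_samples = [0.2, 0.5, 0.9, 1.4]
value_at_half = ecdf(toy_samples, 0.5)
variance_at_median = binomial_variance(0.5, 100)
```

The variance is largest where $F(t) = 1/2$ and vanishes in the tails, which is why the sampling error of the distribution function is usually dominated by its central part.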

The approximation properties of $F_N(t)$ can be studied in various ways, all of
which have the flavor of bounding the error with high probability in the limit of large
$N$. One useful measure is the Kolmogorov-Smirnov distance

    $\sup_{t\in\mathbb{R}}|F_N(t) - F(t)|$.

A result that is useful for being uniform in $t$ is that there is a constant $C > 0$ such
that

    $P\left(\sup_{t\in\mathbb{R}}|F_N(t) - F(t)| > \epsilon\right) \le Ce^{-2\epsilon^2 N}$  for all $\epsilon > 0$;

see [10]. We rewrite this as for any $\epsilon > 0$,

(4.2)    $\sup_{t\in\mathbb{R}}|F_N(t) - F(t)| \le \left(\frac{\log(\epsilon^{-1})}{2N}\right)^{1/2}$

with probability greater than $1 - \epsilon$.
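For planning purposes it can help to evaluate this uniform bound numerically. The sketch below assumes the bound has the half-width $(\log(\epsilon^{-1})/(2N))^{1/2}$, i.e., the constant $C$ is taken as 1; the function names are hypothetical.

```python
import math

def ks_halfwidth(N, eps):
    """Half-width (log(1/eps) / (2N))^(1/2) of the uniform band, taking C = 1."""
    return math.sqrt(math.log(1.0 / eps) / (2.0 * N))

def samples_needed(halfwidth, eps):
    """Smallest N for which ks_halfwidth(N, eps) <= halfwidth."""
    return math.ceil(math.log(1.0 / eps) / (2.0 * halfwidth**2))

band_5000 = ks_halfwidth(5000, 0.05)
n_for_one_percent = samples_needed(0.01, 0.05)
```

Halving the band width requires four times as many samples, which reflects the slow $N^{-1/2}$ convergence of Monte Carlo sampling.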

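Another standard measure is the mean square error (MSE),

    $\mathrm{MSE}(\hat\Theta) = E((\hat\Theta - \Theta)^2)$,

where $\hat\Theta$ is an estimator for $\Theta$. We define

    $X_n(t) = 1$ if $Q^n \le t$ and $0$ otherwise,    $X(t) = 1$ if $Q \le t$ and $0$ otherwise.

We have

    $F(t) = E(X(t))$,    $F_N(t) = \frac{1}{N}\sum_{n=1}^{N} X_n(t)$.

For all $t$,

(4.3)    $\mathrm{MSE}(F_N(t)) = \mathrm{Var}(F_N(t)) = \frac{\sigma^2}{N}$,    $\sigma^2 = F(t)(1 - F(t))$.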


We  can  also  estimate  the  (unknown)  variance  by  defining 


is  a  computable  estimator  for  cr^  and 
(4.4)  MSE[Sl)  = 

Another useful result follows from the observation that $\{X_n\}$ are independent
and identically distributed Bernoulli variables. The Chebyshev inequality implies that
for $\epsilon > 0$,

(4.5)    $P\left(|F(t) - F_N(t)| \le \left(\frac{F(t)(1 - F(t))}{N\epsilon}\right)^{1/2}\right) > 1 - \epsilon$.

To obtain a computable estimate, we note that

    $F(t)(1 - F(t)) = F_N(t)(1 - F_N(t)) + (F(t) - F_N(t))(1 - F(t) - F_N(t))$.

Therefore, using (4.5) along with the fact that $0 \le F$ and $F_N \le 1$,

fF^m-Fuit))y^\  1 

-  V  AAe  J  2Me' 

We  conclude 
(4.6) 


1/2 


U : 

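Next, we consider II in (4.1):

    $\mathrm{II} = \left|\frac{1}{N}\sum_{n=1}^{N}\left(I(\tilde Q^n \le t) - I(Q^n \le t)\right)\right| = \left|\frac{1}{N}\sum_{n=1}^{N}\left(I(Q^n + \mathcal{E}^n \le t) - I(Q^n \le t)\right)\right|$.

We estimate

(4.7)    $\mathrm{II} \le \frac{1}{N}\sum_{\mathcal{E}^n > 0} I(Q^n \le t \le Q^n + |\mathcal{E}^n|) + \frac{1}{N}\sum_{\mathcal{E}^n < 0} I(Q^n - |\mathcal{E}^n| \le t \le Q^n)$
             $\le \frac{1}{N}\sum_{n=1}^{N} I(Q^n - |\mathcal{E}^n| \le t \le Q^n + |\mathcal{E}^n|)$.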

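If instead we expand using $Q^n = \tilde Q^n - \mathcal{E}^n$, we obtain the computable estimate

(4.8)    $|F_N(t) - \tilde F_N(t)| \le \frac{1}{N}\sum_{n=1}^{N} I(\tilde Q^n - |\mathcal{E}^n| \le t \le \tilde Q^n + |\mathcal{E}^n|)$.

Setting $\mathcal{E} = \max \mathcal{E}^n$ in (4.7), we obtain

(4.9)    $|F_N(t) - \tilde F_N(t)| \le |F_N(t + \mathcal{E}) - F_N(t - \mathcal{E})|$.

Now

    $|F_N(t + \mathcal{E}) - F_N(t - \mathcal{E})| \le |F(t + \mathcal{E}) - F(t - \mathcal{E})|$
        $+ |F(t + \mathcal{E}) - F_N(t + \mathcal{E})| + |F(t - \mathcal{E}) - F_N(t - \mathcal{E})|$.

Using (4.2), for any $\epsilon > 0$,

    $|F(t \pm \mathcal{E}) - F_N(t \pm \mathcal{E})| \le \left(\frac{\log(\epsilon^{-1})}{2N}\right)^{1/2}$

with probability greater than $1 - \epsilon$. Therefore, for any $\epsilon > 0$,

(4.10)    $|F_N(t) - \tilde F_N(t)| \le |F(t + \mathcal{E}) - F(t - \mathcal{E})| + 2\left(\frac{\log(\epsilon^{-1})}{2N}\right)^{1/2}$

with probability greater than $1 - \epsilon$.

Note that

    $\tilde F_N(t)(1 - \tilde F_N(t)) = F_N(t)(1 - F_N(t)) + (\tilde F_N(t) - F_N(t))(1 - F_N(t) - \tilde F_N(t))$.

We can bound the second expression on the right-hand side using (4.9) or (4.10) to
obtain a computable estimator for the variance of $F$.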

We  summarize  the  most  useful  results. 

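Theorem 4.1. For any $\epsilon > 0$,

(4.11)    $|F(t) - \tilde F_N(t)| \le \left(\frac{\tilde F_N(t)(1 - \tilde F_N(t))}{N\epsilon}\right)^{1/2} + 2\,\frac{1}{N}\sum_{n=1}^{N} I(\tilde Q^n - |\mathcal{E}^n| \le t \le \tilde Q^n + |\mathcal{E}^n|) + \frac{1}{2N\epsilon}$

with probability greater than $1 - \epsilon$.

With $L$ denoting the Lipschitz constant of $F$, for any $\epsilon > 0$,

(4.12)    $|F(t) - \tilde F_N(t)| \le \left(\frac{F(t)(1 - F(t))}{N\epsilon}\right)^{1/2} + L\max_{1\le n\le N}\mathcal{E}^n + 2\left(\frac{\log(\epsilon^{-1})}{2N}\right)^{1/2}$

with probability greater than $1 - \epsilon$.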
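The computable bound can be evaluated directly from the approximate sample values and their error estimates. The sketch below assumes the bound is the sum of three terms: a sampling term $(\tilde F_N(1-\tilde F_N)/(N\epsilon))^{1/2}$, twice the fraction of samples whose error interval contains $t$, and a remainder $1/(2N\epsilon)$. The function names and toy data are hypothetical.

```python
import math

def ecdf(values, t):
    """Sample distribution function of the approximate values at t."""
    return sum(1 for q in values if q <= t) / len(values)

def interval_fraction(q_tilde, err, t):
    """Fraction of samples n with q_tilde[n] - |err[n]| <= t <= q_tilde[n] + |err[n]|."""
    hits = sum(1 for q, e in zip(q_tilde, err) if q - abs(e) <= t <= q + abs(e))
    return hits / len(q_tilde)

def cdf_error_bound(q_tilde, err, t, eps):
    """Sum of the sampling, numerical, and remainder terms (assumed form of the bound)."""
    N = len(q_tilde)
    F = ecdf(q_tilde, t)
    sampling = math.sqrt(F * (1.0 - F) / (N * eps))
    numerical = 2.0 * interval_fraction(q_tilde, err, t)
    remainder = 1.0 / (2.0 * N * eps)
    return sampling + numerical + remainder

q_tilde = [0.1, 0.2, 0.3, 0.4]       # toy approximate sample values
err = [0.05, -0.05, 0.0, 0.02]       # toy signed error estimates
bound = cdf_error_bound(q_tilde, err, t=0.22, eps=0.5)
```

Only the second term depends on the numerical error, so it shows directly how many samples are too inaccurate to be placed reliably on one side of $t$.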

Remark  4.1.  The  leading  order  bounding  terms  in  the  a  posteriori  bound  (4.11) 
are  computable,  while  the  remainder  tends  to  zero  more  rapidly  in  the  limit  of  large 
M.  The  bound  (4.12)  is  useful  for  the  design  of  adaptive  algorithms  among  other 
things.  Assuming  that  the  solutions  of  the  elliptic  problems  are  in  it  indicates 
that  the  error  in  the  computed  distribution  function  is  bounded  by  an  expression  in 
which  the  leading  order  is  proportional  to 


1 


+  Lh'^ 


with  probability  1  —  £.  This  suggests  that,  in  order  to  balance  the  error  arising  from 
finite  sampling  against  the  error  in  each  computed  sample,  we  typically  should  choose 


This  presents  a  compelling  argument  for  seeking  efficient  ways  to  compute  samples 
and  control  the  accuracy. 

Remark 4.2. The expression

    $\frac{1}{N}\sum_{n=1}^{N} I(\tilde Q^n - |\mathcal{E}^n| \le t \le \tilde Q^n + |\mathcal{E}^n|)$

is itself an expected value. If $M < N$ and

    $\mathcal{N}' = \{n_1 < \cdots < n_M\}$

is a set of integers chosen at random from $\{1,\dots,N\}$, we can use the unbiased
estimator

(4.13)    $\frac{1}{M}\sum_{n\in\mathcal{N}'} I(\tilde Q^n - |\mathcal{E}^n| \le t \le \tilde Q^n + |\mathcal{E}^n|)$,

which has error that decreases as $O(1/\sqrt{M})$. This is reasonable when $N$ is large
since we are likely to require less accuracy in the error estimate than in the primary
quantity of interest.
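A subsampled version of the indicator average is straightforward to sketch. Here the random subset is drawn without replacement using the standard library; the function names and toy data are illustrative assumptions.

```python
import random

def interval_indicator_mean(q_tilde, err, t, indices):
    """Average over the given indices of I(q - |e| <= t <= q + |e|)."""
    hits = sum(
        1 for n in indices
        if q_tilde[n] - abs(err[n]) <= t <= q_tilde[n] + abs(err[n])
    )
    return hits / len(indices)

def subsampled_estimate(q_tilde, err, t, M, rng):
    """Estimate the full average from M indices drawn at random without replacement."""
    indices = rng.sample(range(len(q_tilde)), M)
    return interval_indicator_mean(q_tilde, err, t, indices)

rng = random.Random(3)
q_tilde = [rng.random() for _ in range(1000)]       # toy approximate sample values
err = [0.05 * (rng.random() - 0.5) for _ in range(1000)]  # toy error estimates
full = interval_indicator_mean(q_tilde, err, 0.5, range(1000))
cheap = subsampled_estimate(q_tilde, err, 0.5, 100, rng)
```

Only the error estimates for the $M$ selected samples are needed, so the adjoint-based estimates need not be computed for every sample.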

Remark 4.3. A similar error analysis can be carried out for an arbitrary stochastic
moment $q$ with an unbiased estimator $Q$ using $N$ samples. We let $\tilde X$ be an
approximation to $X$ and decompose the error as

    $|q(X) - Q(\tilde X)| \le |q(X) - Q(X)| + |Q(X) - Q(\tilde X)|$.

The first term can be estimated using the Chebyshev inequality, for $\epsilon > 0$,

    $P\left(|q(X) - Q(X)| \le \left(\frac{\mathrm{Var}(Q(X))}{\epsilon}\right)^{1/2}\right) > 1 - \epsilon$.

Since the variance of $Q(X)$ decreases as $N$ increases, we obtain estimates for this term
analogous to the expressions above. We can estimate the numerical error $Q(X) - Q(\tilde X)$
by computing a solution on a finer mesh and using it in place of the exact solution. In the particular case that
$q(X) = E[X]$, we can compute the quantity very efficiently; see section 3.

4.1. A numerical example. We illustrate the accuracy of the computable
bound (4.11) using some simple experiments. We emphasize that (4.11) is a bound
and, in particular, that we trade accuracy in estimating the size of the error for
an increased probability that the bound is larger than the error. In this case, we
desire that the degree of overestimation does not depend strongly on the discretization
parameters.

To carry out the test, we specify a true cumulative distribution function (c.d.f.)
and sample $\mathcal{N}$ points at random from the distribution. To each sample value, we add
an error drawn at random from another distribution. We use the Kaplan-Meier
estimate in the Matlab statistics toolbox for the approximate c.d.f. and then compute
the difference with the true c.d.f. values at the sample points. We also compute the
difference divided by the true c.d.f. values.
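The test procedure just described can be sketched in a few lines. This is an illustrative stand-in, not the Matlab computation used in the paper: with no censored observations the Kaplan-Meier estimate coincides with the ordinary empirical c.d.f., which is what the sketch computes. The parameter values mimic the first computation (normal samples with mean 1 and variance 2, uniform errors), and evaluating at the perturbed sample points is an assumption.

```python
import math
import numpy as np

def normal_cdf(x, mean, var):
    """Exact c.d.f. of a normal distribution with the given mean and variance."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))

def ecdf_at_samples(values):
    """Empirical c.d.f. evaluated at the sorted sample points."""
    x = np.sort(np.asarray(values, dtype=float))
    return x, np.arange(1, x.size + 1) / x.size

rng = np.random.default_rng(2)
n, delta = 5000, 0.001
exact = rng.normal(loc=1.0, scale=math.sqrt(2.0), size=n)   # true samples
perturbed = exact + rng.uniform(-delta, delta, size=n)       # samples with added error

x, f_approx = ecdf_at_samples(perturbed)
f_true = np.array([normal_cdf(xi, 1.0, 2.0) for xi in x])
abs_err = np.abs(f_approx - f_true)   # difference with the true c.d.f.
rel_err = abs_err / f_true            # difference divided by the true c.d.f.
```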

The experiments we report include the following.

Computation   Sample distribution           Error distribution
First         Normal, mean 1, variance 2    Uniform on $[-\delta, \delta]$
Second        Exponential, parameter 1      Uniform on $[-\delta, \delta]$
Third         Exponential, parameter 1      Uniform on $(-\delta X, \delta X)$, $X$ = sample value

We  obtained  similar  results  for  a  variety  of  examples. 

In Figure 4.1, we present three examples of approximate c.d.f. functions. In all
cases, we bound the error with probability greater than 95%. In Figure 4.2, we plot
the 95% confidence level bound calculated from (4.11) and compare this to the actual
errors.
Fig. 4.1. Plots of approximate (solid line) and true (dashed line) c.d.f. functions. Left: first
computation with $\mathcal{N} = 5000$, $\delta = .001$. Middle: second computation with $\mathcal{N} = 2000$, $\delta = .0001$.
Right: third computation with $\mathcal{N} = 500$, $\delta = .05$.


Fig. 4.2. Plots of the 95% confidence level bound calculated from (4.11) for the examples shown
in Figure 4.1. Left: first computation with $\mathcal{N} = 5000$, $\delta = .001$. Middle: second computation with
$\mathcal{N} = 2000$, $\delta = .0001$. Right: third computation with $\mathcal{N} = 500$, $\delta = .05$.


Fig. 4.3. Performance of the bound (4.11) for the three examples shown in Figure 4.1. Left:
plot of the difference between the estimate and bound versus the error. Right: plot of the relative
difference versus the error.


In  Figure  4.3,  we  plot  the  performance  of  the  bound  with  respect  to  estimating 
the  size  of  the  error.  In  all  three  cases,  the  bound  is  asymptotically  around  5  times 
too  large. 

5. Adaptive error control. We now use Theorems 2.2, 3.2, and 4.1 to construct
an adaptive error control algorithm. The computational parameters we wish to
optimize are the mesh size $h$, the number of terms $P$ in the truncated Neumann series,
the number of iterations $I$ in the domain decomposition algorithm, and the number
of samples $\mathcal{N}$. The first task is to express the error $\mathcal{E}$ as a sum of three terms
corresponding, respectively, to discretization error, error from the incomplete Neumann
series, and error from the incomplete domain decomposition iteration.
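Considering the problem with randomly perturbed diffusion coefficient, we bound
the leading expression in the error estimate (2.9) in one sample value as

$$
\begin{aligned}
\mathcal{E}^n &= |(f, \Phi^n_{P,I}) - (\mathcal{A}^n \nabla U^n_{P,I}, \nabla \Phi^n_{P,I})| \\
&\le |(f, \Phi^n_{P,I}) - (\mathcal{A}^n \nabla U^n_{\infty,\infty}, \nabla \Phi^n_{P,I})| \\
&\quad + |(\mathcal{A}^n \nabla (U^n_{\infty,\infty} - U^n_{P,\infty}), \nabla \Phi^n_{P,I})| \\
&\quad + |(\mathcal{A}^n \nabla (U^n_{P,\infty} - U^n_{P,I}), \nabla \Phi^n_{P,I})| \\
&= \mathcal{E}^n_{I} + \mathcal{E}^n_{II} + \mathcal{E}^n_{III},
\end{aligned}
\tag{5.1}
$$

where we use the obvious notation to denote the quantities obtained by taking $I, P \to \infty$.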

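The goal is to estimate $\mathcal{E}^n_{I}$, $\mathcal{E}^n_{II}$, and $\mathcal{E}^n_{III}$ in terms of computable quantities. To do
this, we introduce $\Delta I$ and $\Delta P$ as positive integers and use the approximations

$$ U^n_{P,I+\Delta I} \approx U^n_{P,\infty}, \qquad U^n_{P+\Delta P,I+\Delta I} \approx U^n_{\infty,\infty}. $$

The accuracy of the estimates below improves as $\Delta I$ and $\Delta P$ increase.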

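We have

$$ \mathcal{E}^n_{I} \approx |(f, \Phi^n_{P,I}) - (\mathcal{A}^n \nabla U^n_{P,I}, \nabla \Phi^n_{P,I}) - (\mathcal{A}^n \nabla (U^n_{P+\Delta P,I+\Delta I} - U^n_{P,I}), \nabla \Phi^n_{P,I})|. \tag{5.2} $$

Likewise, we estimate

$$ \mathcal{E}^n_{II} \approx |(\mathcal{A}^n \nabla (U^n_{P+\Delta P,I+\Delta I} - U^n_{P,I+\Delta I}), \nabla \Phi^n_{P,I})|, \tag{5.3} $$

$$ \mathcal{E}^n_{III} \approx |(\mathcal{A}^n \nabla (U^n_{P,I+\Delta I} - U^n_{P,I}), \nabla \Phi^n_{P,I})|. \tag{5.4} $$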

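We can find another expression for $\mathcal{E}^n_{II}$ by passing to the limit in (2.3) on each
domain $d$ to write

$$ U^{n,d}_{\infty,\infty} = \sum_{p=0}^{\infty} \left( (-\mathcal{A}^{n,d})^p ((k^{a,d})^{-1} k^d)^p \right) (k^{a,d})^{-1} \left( b^d(f) + b^{n,d}(\mathcal{A}^n, U^{n,d'}_{\infty,\infty}) \right), $$

while

$$ U^{n,d}_{P,\infty} = \sum_{p=0}^{P-1} \left( (-\mathcal{A}^{n,d})^p ((k^{a,d})^{-1} k^d)^p \right) (k^{a,d})^{-1} \left( b^d(f) + b^{n,d}(\mathcal{A}^n, U^{n,d'}_{P,\infty}) \right). $$

Subtracting and approximating, we find

$$
\begin{aligned}
U^{n,d}_{\infty,\infty} - U^{n,d}_{P,\infty} \approx{} & \sum_{p=P}^{\infty} \left( (-\mathcal{A}^{n,d})^p ((k^{a,d})^{-1} k^d)^p \right) (k^{a,d})^{-1} \left( b^d(f) + b^{n,d}(\mathcal{A}^n, U^{n,d'}_{\infty,\infty}) \right) \\
& + \sum_{p=0}^{P-1} \left( (-\mathcal{A}^{n,d})^p ((k^{a,d})^{-1} k^d)^p \right) (k^{a,d})^{-1} \left( b^{n,d}(\mathcal{A}^n, U^{n,d'}_{\infty,\infty}) - b^{n,d}(\mathcal{A}^n, U^{n,d'}_{P,\infty}) \right).
\end{aligned}
$$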
Summing yields


fjn,d  _  fjn,d 
^00,00  ^P,oo 


v-i 

p=0 

Finally,  approximating  yields 

fjn,d  _  ijn,d 
^V,oo 

v-i 

+  E  ((->l"-'^)'’((k“’'')'''k'')P)  (k“-'')-»(6”'''(A".Lf":^E,l)  - 

p=0 

We  denote  the  operators  corresponding  to  k'*  and  k  on  Vh„d  by  fc“  and  k,  respectively. 
We  have 

(5.5)  «  E  f  j). 


+  E  ((-A"''')'’((k“’'')-ik'')P)  (k“’'')-'(6"’''{A".t/;-^lp,j.)  -6"-''(AM/;;^')) ). 

p=Q  ^ 

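We now present an adaptive error control strategy in Algorithm 3 based on Theorem 4.1
and the approximations

$$ \mathcal{E}^n \approx \tilde{\mathcal{E}}^n = \mathcal{E}^n_{I} + \mathcal{E}^n_{II} + \mathcal{E}^n_{III}. $$

We set

$$ \mathcal{E}_{I} = \max_n \mathcal{E}^n_{I}, \qquad \mathcal{E}_{II} = \max_n \mathcal{E}^n_{II}, \qquad \mathcal{E}_{III} = \max_n \mathcal{E}^n_{III}. $$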


We  define  in  addition 


for  a  given  e  >  0. 

5.1. A numerical example. We apply the adaptive algorithm to the problem
given in section 2.4. We start with a coarse mesh and a small number of iterations,
terms, and samples and let the adaptive algorithm choose the parameter values in
order to make the error bound of $F(t)$ smaller than 15% with 95% likelihood; i.e., we
set TOL = 0.15 and $\epsilon = .05$. We set $\sigma_1 = 0.5$, $\sigma_2 = \sigma_3 = 0.125$, and $\sigma_4 = 0.25$.

Initially, we let $h = 1/18$ determine a uniform initial mesh, $I = 40$, $P = 1$, and
$\mathcal{N} = 60$. We set $\Delta I = 0.3I$ and $\Delta P = 1$. We compute the adjoint solution using
a refined mesh with $\tilde{h} = h/2$ but using the same number of iterations, terms, and
samples as the forward problem. To refine, we set $h_i = 1/(9(i - 1))$, with $i = 3$
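Algorithm 3. Adaptive algorithm
Choose $\epsilon$ in Theorem 4.1, which determines the reliability of the error control
Let TOL be the desired tolerance of the error $|F(t) - F_{\mathcal{N}}(t)|$
Let $\sigma_1 + \sigma_2 + \sigma_3 + \sigma_4 = 1$ be positive numbers that are used to apportion the tolerance
   TOL between the four contributions to the error, with values chosen based on the
   computational cost associated with changing the four discretization parameters
Choose initial meshes $\mathcal{T}_h$, $\mathcal{T}_{\tilde{h}}$ and $P$, $I$, and $\mathcal{N}$
Compute $\{U^n_{P,I},\ n = 1, \dots, \mathcal{N}\}$ in the space $V_h$ and the sample quantity of interest values
Compute $\{\Phi^n_{P,I},\ n = 1, \dots, \mathcal{N}\}$ in the space $V_{\tilde{h}}$
Compute $\mathcal{E}^n_{I}$, $\mathcal{E}^n_{II}$, $\mathcal{E}^n_{III}$ for $n = 1, \dots, \mathcal{N}$
Compute $F_{\mathcal{N}}(t)$
Estimate the Lipschitz constant $L$ of $F$ using $F_{\mathcal{N}}$
Compute $\mathcal{E}_{IV}$
while $\mathcal{E}_{IV} + \left| \frac{1}{\mathcal{N}} \sum_{n=1}^{\mathcal{N}} I\left(Q^n - |\tilde{\mathcal{E}}^n| \le t \le Q^n + |\tilde{\mathcal{E}}^n|\right) \right| >$ TOL do
   if $L\mathcal{E}_{I} > \sigma_1$TOL then
      Refine $\mathcal{T}_h$ and $\mathcal{T}_{\tilde{h}}$ to meet the prediction that $\mathcal{E}_{I} \approx \sigma_1$TOL on the new mesh
   end if
   if $L\mathcal{E}_{II} > \sigma_2$TOL then
      Increase $P$ to meet the prediction $\mathcal{E}_{II} \approx \sigma_2$TOL
   end if
   if $L\mathcal{E}_{III} > \sigma_3$TOL then
      Increase $I$ to meet the prediction $\mathcal{E}_{III} \approx \sigma_3$TOL
   end if
   if $\mathcal{E}_{IV} > \sigma_4$TOL then
      Increase $\mathcal{N}$ to meet the prediction $\mathcal{E}_{IV} \approx \sigma_4$TOL
   end if
   Compute $\{U^n_{P,I},\ n = 1, \dots, \mathcal{N}\}$ in the space $V_h$ and the sample quantity of interest values
   Compute $\{\Phi^n_{P,I},\ n = 1, \dots, \mathcal{N}\}$ in the space $V_{\tilde{h}}$
   Compute $\mathcal{E}^n_{I}$, $\mathcal{E}^n_{II}$, $\mathcal{E}^n_{III}$ for $n = 1, \dots, \mathcal{N}$
   Compute $F_{\mathcal{N}}(t)$
   Estimate the Lipschitz constant $L$ of $F$ using $F_{\mathcal{N}}$
   Compute $\mathcal{E}_{IV}$
end while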
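The control flow of Algorithm 3 can be sketched in Python as below. This is only a skeleton under loud assumptions: the `estimate` callable is a hypothetical stand-in for the solves and error estimates of the algorithm, the toy estimator is invented purely to exercise the loop, and the fixed update rules (halve the mesh size, add one term, double the iterations or samples) replace the paper's "meet the prediction" step.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Params:
    h: float   # mesh size
    P: int     # number of terms in the truncated Neumann series
    I: int     # number of domain decomposition iterations
    N: int     # number of samples

# Hypothetical interface: returns (L*E_I, L*E_II, L*E_III, E_IV, total bound).
Estimator = Callable[[Params], Tuple[float, float, float, float, float]]

def adapt(params: Params, estimate: Estimator, tol: float,
          sigma=(0.5, 0.125, 0.125, 0.25), max_iter: int = 20):
    """Increase each parameter whose error share exceeds its part of the tolerance."""
    history: List[Params] = [params]
    for _ in range(max_iter):
        e1, e2, e3, e4, total = estimate(params)
        if total <= tol:
            break
        h, P, I, N = params.h, params.P, params.I, params.N
        if e1 > sigma[0] * tol:
            h = h / 2          # refine the mesh (assumed update rule)
        if e2 > sigma[1] * tol:
            P = P + 1          # more Neumann series terms
        if e3 > sigma[2] * tol:
            I = 2 * I          # more domain decomposition iterations
        if e4 > sigma[3] * tol:
            N = 2 * N          # more samples
        params = Params(h, P, I, N)
        history.append(params)
    return params, history

def toy_estimate(p: Params):
    """Invented error model, used only to demonstrate the loop."""
    e1, e2, e3, e4 = p.h, 0.5 ** p.P, 1.0 / p.I, p.N ** -0.5
    return e1, e2, e3, e4, e1 + e2 + e3 + e4

final, history = adapt(Params(1 / 18, 1, 40, 60), toy_estimate, tol=0.15)
```

The weights `sigma` default to the values used in the numerical example of section 5.1; they apportion the tolerance between the four error contributions.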


initially, and then for each refinement we increment $i$ by 2. This means that we get
3, 5, 7, etc. nodes in the x-direction and y-direction on each subdomain.

In Figure 5.1 we present the parameter values for each of the iterates. The
tolerance was reached after three iterations with $h = 1/54$, $I = 160$, $P = 3$, and
$\mathcal{N} = 240$. In Figure 5.2, we plot the error bound indicators after each iteration in the
adaptive algorithm and the total error bound. We compute an approximate error
using a reference solution with $h = 1/72$, $I = 300$, $P = 5$, and $\mathcal{N} = 480$ and
show the result in Figure 5.3. The error decreases from almost 100% initially, with a
distribution function that fails to detect critical behavior, to an error of around 30%
and finally to an error of less than 3%.

Fig. 5.1. Computational parameters chosen adaptively according to the adaptive algorithm.

Fig. 5.2. The error estimators computed after each iteration in the adaptive algorithm.

Fig. 5.3. Approximate error in the solutions produced by the adaptive algorithm for iterations
1, 2, and 3.

6. Conclusion. In this paper, we consider the nonparametric density estimation
problem for a quantity of interest computed from solutions of an elliptic partial
differential equation with randomly perturbed coefficients and data. We focus on
problems for which only limited knowledge of the random perturbations is available. In
particular, we assume that the random perturbation to the diffusion coefficient is
described by a piecewise constant function. We derive an efficient method for computing
samples and generating an approximate probability distribution based on Lions'
domain decomposition method and the Neumann series. We then derive an a posteriori
error estimate for the computed probability distribution reflecting all sources of
deterministic and statistical errors, including discretization of the domain, finite iteration
of the domain decomposition iteration, finite truncation in the Neumann series, and
the effect of using a finite number of random samples. Finally, we develop an adaptive
error control algorithm based on the a posteriori estimate.
A MEASURE-THEORETIC COMPUTATIONAL METHOD FOR
INVERSE SENSITIVITY PROBLEMS I: METHOD AND ANALYSIS

J. BREIDT, T. BUTLER, AND D. ESTEP

Abstract.  We  consider  the  inverse  sensitivity  analysis  problem  of  quantifying  the  uncertainty 
of  inputs  to  a  deterministic  map  given  specified  uncertainty  in  a  linear  functional  of  the  output  of  the 
map.  This  is  a  version  of  the  model  calibration  or  parameter  estimation  problem  for  a  deterministic 
map.  We  assume  that  the  uncertainty  in  the  quantity  of  interest  is  represented  by  a  random  variable 
with  a  given  distribution,  and  we  use  the  law  of  total  probability  to  express  the  inverse  problem 
for  the  corresponding  probability  measure  on  the  input  space.  Assuming  that  the  map  from  the 
input  space  to  the  quantity  of  interest  is  smooth,  we  solve  the  generally  ill-posed  inverse  problem 
by using the implicit function theorem to derive a method for approximating the set-valued inverse
that provides an approximate quotient space representation of the input space. We then derive an
efficient computational approach to compute a measure-theoretic approximation of the probability
measure  on  the  input  space  imparted  by  the  approximate  set-valued  inverse  that  solves  the  inverse 
problem. 

Key words. adjoint problem, density estimation, inverse sensitivity analysis, model calibration,
nonparametric  density  estimation,  parameter  estimation,  sensitivity  analysis,  set-valued  inverse 

AMS  subject  classifications.  60-08,  34F05 

DOI.  10.1137/100785946 

1. Introduction. We develop and analyze a numerical method to solve the inverse
sensitivity analysis problem: Given a specified variation and/or uncertainty in
the output of a smooth map, determine variations in the input parameters that produce
the observed uncertainty. We formulate this inverse problem using probability
to describe variation by assuming that the inputs and outputs are random variables.
This inverse problem has an abstract interpretation in which the density is imposed
on the output in order to observe the consequences for the inputs. It also has an
experimental interpretation in which the model output matches observed values of an
experiment and the imposed density is associated with the experimental data, i.e.,
reflecting the uncertainty in the data or arising as a consequence of experimental
error.
To motivate this inverse sensitivity analysis problem, consider the situation of a
manufacturer  who  will  purchase  a  large  number  of  metal  plates  of  a  given  alloy  and 
thickness  that  are  to  be  used  subsequently  in  a  high  temperature  environment.  In 
order  to  ensure  the  plates  maintain  integrity,  the  manufacturer  specifies  that  a  given 
heat  load  must  be  distributed  quasi-uniformly  after  ten  minutes  of  exposure,  with 
some  conditions  on  how  much  the  temperature  may  vary  through  the  plate.  The 
plates  are  milled  with  variations  in  the  purity  of  the  alloy  and  the  thickness  of  the 
plates,  both  of  which  affect  the  heat  distribution  under  load.  To  check  a  batch  of  plates 
to  see  if  it  meets  the  requirements,  the  manufacturer  tests  the  heat  specification  on  a 
random  sample  of  plates  drawn  from  the  batch.  The  random  selection  of  samples,  the 
variation  in  plate  properties,  and  measurement  error  combined  lead  to  a  description 
of  the  test  results  as  a  random  variable.  After  delivery,  the  manufacturer  decides  that 
knowing  the  statistics  on  the  size  of  the  plates  and  the  composition  of  the  alloy  would 
be  useful.  The  heat  equation  models  the  heat  distribution  under  a  given  load  once 
the  conductivity  determined  by  the  alloy  composition  and  the  thickness  of  the  plates 
are  specified.  The  inverse  sensitivity  problem  is  to  determine  the  distribution  on  the 
space  of  parameters  consisting  of  the  thickness  and  alloy  purity  from  the  distribution 
of  the  results  of  the  heat  experiments  on  the  plates. 

The probabilistic inverse problem can be described more precisely as follows.

Given

• a model $M(Y, \lambda)$ with solution $Y = Y(\lambda)$ depending on parameters and data
  $\lambda$ in parameter space $\Lambda \subset \mathbb{R}^d$,

• a linear functional $q(\lambda) = q(Y(\lambda))$ taking values in an output space $\mathcal{D}$,

• an observed probability density $\rho_{\mathcal{D}}(q(\lambda)) = \rho_{\mathcal{D}}(q(Y(\lambda)))$ on the output value,

determine

• a probability density $\sigma_\Lambda(\lambda)$ on the parameter space $\Lambda$ that produces the
  observed density.

We assume the model $M(Y, \lambda)$ depends smoothly on the inputs, so the map $q(\lambda)$ is
implicitly a smooth and deterministic function of $\lambda$.

There are several important issues associated with this problem. First, in general, the
parameter space is multidimensional while there is a single observation (or a low-dimensional
set of observations at most). So, the inverse problem is ill-posed in the
sense that the inverse solution of the deterministic model is set-valued. Under the
assumption of a smooth model, we address this issue by constructing a systematic
method for approximating set-valued inverses. Second, we are particularly interested
in models that are complicated and/or expensive to evaluate, e.g., requiring the solution
of a differential equation, so that the map to the output is determined implicitly.
We address this issue by using adjoint operators [22, 20, 6, 21, 23, 12, 13, 9, 10, 7, 11]
to compute the required derivative information. Third, while probability densities
describe random variables, the densities themselves are not random. Common
approaches to approximating probability densities often use a random representation
obtained by some variation of Monte Carlo sampling [14, 17, 18]; however, this is not
a requirement. In particular, the approach described in this paper is not stochastic;
rather, it is based on the simple function approximation commonly used in measure theory.

In this paper, we present the basic method and analysis of a measure-theoretic
computational approach for the probabilistic inverse sensitivity analysis problem. In
[4], we present a numerical analysis of the discretization error that arises when evaluating
the model by numerical solution and using a finite number of random samples
to represent the distribution on the output quantity. In [5], we discuss the problem
of dealing with multiple quantities of interest, which has application to data assimilation
and "cascaded" uncertainty in operator decomposition solution of multiphysics
problems.

This  paper  is  structured  as  follows.  In  section  2,  we  formulate  the  probabilistic 
inverse  problem  that  we  solve  and  discuss  the  relation  to  a  Bayesian  inverse  problem. 
In  section  3.1,  we  deal  with  the  set-valued  nature  of  the  inverse  problem  by  introducing 
a  theory  of  generalized  contours  and  explain  how  the  generalized  contours  can  be 
approximated.  In  section  3.2,  we  develop  a  computational  measure  theoretic  method 
for  approximating  the  inverse  parameter  distribution  using  approximate  generalized 
contours.  In  section  4,  we  apply  the  method  to  a  variety  of  problems.  Finally,  section  5 
summarizes  the  work. 

2. Formulation of the probabilistic inverse problem. The inverse problem
we study is the direct inversion of the forward stochastic sensitivity analysis problem
for a deterministic model. We consider a deterministic operator $q(\lambda)$ that maps values
in a parameter space $\Lambda$ to an output space $\mathcal{D}$. We assume there is a parameter
volume measure $\mu_\Lambda$ on $\Lambda$ that determines the volume of sets in $\Lambda$. The volume
measure depends on the units of measure used for the parameters and also reflects
the structural dependency among the parameters, e.g., depending on whether or not
$\mu_\Lambda$ is a product measure. The volume measure is specified as part of the model that
defines the map $q(\lambda)$ since the parameters must be explicitly defined in the physical
model that determines $q$. We assume that $\mu_\Lambda$ is absolutely continuous with respect
to the Lebesgue measure and the volume $\mu_\Lambda(\Lambda)$ of $\Lambda$ is finite.

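We first describe the forward stochastic sensitivity analysis for the deterministic
map $q(\lambda)$. We assume that a probability density $\sigma_\Lambda(\lambda)$ is specified on the parameter
space $\Lambda$. This density distinguishes the probability of different events in $\Lambda$, i.e., the
probability of an event $A$ in $\Lambda$, by which we mean a measurable set of values, is
computed via

$$ P(A) = \int_A \sigma_\Lambda(\lambda) \, d\mu_\Lambda(\lambda). $$

The deterministic model can be expressed in terms of a likelihood function $L(q \mid \lambda)$ of
the output $q$ values given the input parameter values $\lambda$, where $L(q \mid \lambda) = \delta(q - q(\lambda))$
is the unit mass distribution at $q = q(\lambda)$. This implies the fundamental relationship

$$ \text{Law of Total Probability} \qquad \rho_{\mathcal{D}}(q \mid A) = \frac{\int_A L(q \mid \lambda)\, \sigma_\Lambda(\lambda) \, d\mu_\Lambda(\lambda)}{\int_A \sigma_\Lambda(\lambda) \, d\mu_\Lambda(\lambda)}. \tag{2.1} $$

This is a Fredholm integral equation of the first kind that determines a conditional
probability density $\rho_{\mathcal{D}}(q \mid A)$ on the output given that the parameters come from $A$.
Thus, we may determine the conditional probability of event $B \subset \mathcal{D}$ as

$$ P(B \mid A) = \int_B \frac{\int_A L(q \mid \lambda)\, \sigma_\Lambda(\lambda) \, d\mu_\Lambda(\lambda)}{\int_A \sigma_\Lambda(\lambda) \, d\mu_\Lambda(\lambda)} \, d\mu_{\mathcal{D}}(q). $$

For forward sensitivity analysis it is common to take $A = \Lambda$ so that $P(B \mid A) = P(B)$,
and we arrive at the common form for the law of total probability given by

$$ \rho_{\mathcal{D}}(q) = \int_\Lambda L(q \mid \lambda)\, \sigma_\Lambda(\lambda) \, d\mu_\Lambda(\lambda). \tag{2.2} $$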
This describes an analogue of a Perron-Frobenius map where the deterministic map
$q(\lambda)$ defines a transformation of the density $\sigma_\Lambda(\lambda)$ to $\rho_{\mathcal{D}}(q)$. This forward sensitivity
analysis problem is often solved using a Monte Carlo approach: Random parameter
sample values $\lambda$ are drawn from the distribution $\sigma_\Lambda$ on the parameter space;
corresponding values of $q(\lambda)$ are computed; and these values are binned to produce an
approximate probability distribution on the output.
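The Monte Carlo procedure for the forward problem (draw parameter samples, evaluate the map, bin the outputs) can be sketched as follows. The function `forward_density` and its arguments are hypothetical names; the map and sampler used in the usage lines are illustrative choices (the sum of two independent normal parameters with variance 1/25), not part of the method itself.

```python
import numpy as np

def forward_density(q_map, sampler, n_samples, bins, rng):
    """Approximate the output density by sampling parameters and binning q(lambda)."""
    lam = sampler(rng, n_samples)     # draws from the parameter density
    q = q_map(lam)                    # corresponding output values
    density, edges = np.histogram(q, bins=bins, density=True)
    return density, edges

rng = np.random.default_rng(3)

def sampler(rng, n):
    # Two independent N(0, 1/25) parameters: standard deviation 1/5.
    return rng.normal(0.0, 0.2, size=(n, 2))

def q_map(lam):
    # Illustrative deterministic map: the sum of the two parameters.
    return lam[:, 0] + lam[:, 1]

density, edges = forward_density(q_map, sampler, 100000, 50, rng)
```

With `density=True`, the histogram heights integrate to one over the bins, so `density` is a piecewise constant approximation of the output probability density.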

The stochastic inverse sensitivity analysis problem that we study is the inversion
of the Law of Total Probability (2.2).

We assume that an observed probability density $\rho_{\mathcal{D}}(q(\lambda))$ is given
on the output value $q(\lambda)$, and we seek to compute the corresponding
parameter density $\sigma_\Lambda(\lambda)$ that yields $\rho_{\mathcal{D}}(q(\lambda))$ via (2.2).

It  is  important  to  note  that  what  we  seek  for  the  solution  of  the  inverse  problem  is  the 
actual  probability  density  that  can  be  used  to  compute  the  probability  of  events  in  the 
parameter space $\Lambda$. In other words, we seek to compute the inverse of the analogue
of the Perron-Frobenius map between the densities on the input and output spaces.
The  purpose  of  this  paper  is  to  describe  a  method  for  solving  the  inverse  problem 
by  providing  a  way  to  approximate  the  probability  of  an  arbitrary  event  in  the  input 
space.  This  can  be  used  subsequently  to  generate  an  approximation  of  the  inverse 
density  and/or  to  compute  any  desired  statistical  moments  of  the  inverse  density. 

We emphasize the fundamental role of the underlying parameter volume measure
$\mu_\Lambda$ in defining the solution of the inverse problem. In particular, the a priori
specification of $\mu_\Lambda$ imposes the structure of the measure on $\Lambda$, e.g., whether the measure on
$\Lambda$ is a product measure or not. In general, there are many combinations of $\sigma_\Lambda$ and
$\mu_\Lambda$ that can yield a given observed density on the output.

We provide a simple illustration of the inverse problem using the map

$$ q(\lambda) = \lambda_1 + \lambda_2, $$

where $\lambda_1, \lambda_2$ are random variables. For the inverse problem, we specify that $q(\lambda)$
has a $N(0, 2/25)$ distribution and seek to determine the parameter distribution $\sigma_\Lambda(\lambda)$
that yields the specified output density. This output distribution can be generated by
choosing $\lambda_1, \lambda_2$ to be independent identically distributed $N(0, 1/25)$ random variables;
see Figure 2.1. As well, we could choose any bivariate normal density

$$ N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \; \sigma^2 \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right) \quad \text{with} \quad 2\sigma^2(1 + \rho) = \frac{2}{25}. $$

If we find a distribution on $\Lambda$ that generates $q(\lambda)$ according to a $N(0, 2/25)$
distribution, then we accept this as a solution to the inverse problem. The choice of the
underlying parameter volume measure $\mu_\Lambda$ is critical to this task. In Figures 2.1-2.3,
we show five different probability densities $\sigma_\Lambda(\lambda)$ that yield the identical $N(0, 2/25)$
density on $q(\lambda)$. Each of the five densities corresponds to a different underlying
volume measure $\mu_\Lambda$ as shown.
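The non-uniqueness in this illustration is easy to check numerically. For a zero-mean bivariate normal with common variance $\sigma^2$ and correlation $\rho$, the sum of the two components has variance $2\sigma^2(1 + \rho)$, so every pair $(\sigma^2, \rho)$ with $2\sigma^2(1 + \rho) = 2/25$ gives the same output distribution. The helper names below are hypothetical and the correlation values are arbitrary choices.

```python
import numpy as np

TARGET_VAR = 2.0 / 25.0   # variance of the imposed N(0, 2/25) output

def output_variance(sigma2, rho):
    """Variance of lambda_1 + lambda_2 for common variance sigma2 and correlation rho."""
    return 2.0 * sigma2 * (1.0 + rho)

def sigma2_for(rho, target=TARGET_VAR):
    """Common variance that reproduces the target output variance for a given rho."""
    return target / (2.0 * (1.0 + rho))

rng = np.random.default_rng(4)
sample_vars = {}
for rho in (0.0, 0.5, -0.5):
    s2 = sigma2_for(rho)
    cov = s2 * np.array([[1.0, rho], [rho, 1.0]])
    lam = rng.multivariate_normal([0.0, 0.0], cov, size=200000)
    sample_vars[rho] = float(np.var(lam.sum(axis=1)))
```

The case `rho = 0.0` recovers the independent $N(0, 1/25)$ parameters; the other two choices give visibly different joint densities with the same output density.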

The specification of $\mu_\Lambda$ has to do with how measurements in $\Lambda$ are carried out
and the relationships between the parameters. As noted, the volume measure should
be specified as part of defining the model. In many situations involving deterministic
models, the product Lebesgue measure appropriately scaled to account for units is the
natural choice. But this is not always the case. Continuing the motivating problem,
as a first approximation, we might consider the thickness and alloy composition to
be physically independent parameters and impose a product measure on the space
formed by the two variables using independent normalized Lebesgue measures. A
more realistic description will take into account the fact that the thickness of the
plates indirectly depends on the alloy composition during the milling process. We
can model the milling process to determine the thickness as an indirect function
of the physically independent variables of pressure in the milling process and the
alloy composition. The measure on the space consisting of the thickness and alloy
composition is then determined by propagating the product measure imposed on the
independent alloy composition and pressure variables through the milling model. The
resulting measure on the space consisting of the alloy composition and thickness will
not be a product measure.

Fig. 2.1. Left: The $N(0, 2/25)$ distribution imposed on the output $\lambda_1 + \lambda_2$. Right: The joint
distribution of two independent $N(0, 1/25)$ parameters $\lambda_1$ and $\lambda_2$. Summing these variables is one
way to compute the imposed normal on the output quantity. Figures 2.2-2.3 show alternatives.

Fig. 2.2. The joint distributions of parameters $(\lambda_1, \lambda_2)$ sampled with respect to the density
$\sigma_\Lambda(\lambda)$ and the corresponding volume measure presented in pairs of plots. Left two plots: The volume
measure is uniform Lebesgue on $\Lambda$. Right two plots: The volume measure is uniform Lebesgue on a set
with three distinct parts.

Fig. 2.3. The joint distributions of parameters $(\lambda_1, \lambda_2)$ sampled with respect to the density
$\sigma_\Lambda(\lambda)$ and the corresponding volume measure presented in pairs of plots. Left two plots: The volume
measure is uniform Lebesgue on the boundary. Right two plots: The volume measure is uniform
Lebesgue on a nonconvex interior set.

The plots of inverse densities given in Figures 2.2-2.3 also illustrate the important
point that injecting probability into the inverse problem by itself does not reduce the
ill-posedness, even after specifying the parameter volume measure. The consequence
of ill-posedness on the stochastic inverse problem is illustrated by the complex measure
structure of the inverse probability densities in the plots. For example, these densities
are not product measures. In general, it is not possible to determine densities for the
individual parameters without further information. We can determine only a measure
on the entire parameter space.

Comparison to a Bayesian inverse problem. There is another natural inverse
problem associated with the Law of Total Probability (2.1) that is important in
the case of a general likelihood function $L(q \mid \lambda)$, not necessarily arising from a
deterministic map. Namely, we may use Bayes' theorem to invert the likelihood function to
obtain the "posterior density" $\rho(\lambda \mid q)$ given the "prior density" $\sigma_\Lambda$ on the input space
$\Lambda$ and a "data density" $\rho_{\mathcal{D}}$ on the output space $\mathcal{D}$. We emphasize that the solution of
this Bayesian inverse problem is a conditional distribution. This is very natural when
the map from the input to output space has been modeled statistically by specifying
$L(q \mid \lambda)$ given information about the statistical properties of the input parameters and
output quantity, e.g., when the map is derived empirically, rather than from physical
principles.

This Bayesian inverse problem is at the heart of Bayesian inference [26, 1, 19, 18].
In this approach, the inferential target is a single, unknown parameter (or parameter
vector) $\lambda$. We are given data in the form of observations $q_1, \dots, q_n$, for which a typical
assumption is conditional independence,

$$ \rho(q_1, \dots, q_n \mid \lambda) = \prod_{i=1}^{n} \rho(q_i \mid \lambda), \tag{2.3} $$

where $\{\rho(q_i \mid \lambda)\}$ are conditional probability densities with respect to some appropriate
measure, and are specified up to the value of $\lambda$. The right-hand side of (2.3) is
the likelihood of the observations given the parameter. We are also given a prior
distribution on $\lambda$ that gives a probabilistic description of the uncertainty about the
values of $\lambda$ before any data are observed. This prior distribution is exactly $\sigma_\Lambda(\lambda)$ in
the notation used above. Bayesian inference then proceeds by using Bayes' theorem to
compute the posterior conditional distribution of $\lambda$ given the observations $q_1, \dots, q_n$,

$$ \rho(\lambda \mid q_1, \dots, q_n) \propto \rho(q_1, \dots, q_n \mid \lambda)\, \sigma_\Lambda(\lambda) = \prod_{i=1}^{n} \rho(q_i \mid \lambda)\, \sigma_\Lambda(\lambda). \tag{2.4} $$
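For a scalar parameter, the Bayesian update (2.4) can be evaluated directly on a grid. The sketch below is an illustration under stated assumptions, not part of the paper: the prior is standard normal, the likelihood is normal with known noise standard deviation `noise_std`, and the three observations are made up. The function `grid_posterior` is a hypothetical helper.

```python
import numpy as np

def grid_posterior(grid, prior, likelihood, observations):
    """Posterior on a uniform grid: prior times the product of likelihoods, normalized."""
    log_post = np.log(prior)
    for q_i in observations:
        log_post = log_post + np.log(likelihood(q_i, grid))
    log_post -= log_post.max()            # guard against underflow
    post = np.exp(log_post)
    dx = grid[1] - grid[0]
    return post / (post.sum() * dx)       # normalize to integrate to one

noise_std = 0.5                            # assumed known observation noise
grid = np.linspace(-5.0, 5.0, 2001)
prior = np.exp(-0.5 * grid**2) / np.sqrt(2.0 * np.pi)   # N(0, 1) prior density

def likelihood(q_i, lam):
    # Normal density of one observation q_i given the parameter value(s) lam.
    z = (q_i - lam) / noise_std
    return np.exp(-0.5 * z**2) / (noise_std * np.sqrt(2.0 * np.pi))

observations = [0.9, 1.1, 1.0]             # made-up data
posterior = grid_posterior(grid, prior, likelihood, observations)
dx = grid[1] - grid[0]
posterior_mean = float(np.sum(grid * posterior) * dx)
```

For this conjugate normal setup the posterior mean is known in closed form, $(\sum_i q_i / s^2) / (1 + n / s^2) = 12/13$ with $s$ the noise standard deviation, which gives a simple check of the grid computation.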

We could adopt a Bayesian approach to solve the inverse problem we study by
modeling $\sigma_\Lambda(\lambda)$ parametrically as $\sigma_\Lambda(\lambda \mid \theta)$ in terms of new (lower-dimensional)
parameters $\theta$. This is known as a mixture or hierarchical model. In Bayesian terminology,
$\sigma_\Lambda(\lambda \mid \theta)$ is the prior while a new distribution $\sigma_\theta$ describing $\theta$ is the hyperprior.
Assuming that the hyperprior is specified, we then compute the posterior distribution
on $\theta$ given "data" from $\rho_{\mathcal{D}}(q(\lambda))$. Any desired inferences about the distribution of $\lambda$
given $\theta$ can then be obtained from the posterior. The difficulty with this approach is
specifying a reasonable conditional model, which is difficult to verify empirically.

The inverse problem solved in this paper shares some characteristics with the
Bayesian inverse problem, but has fundamental differences as well. In the Bayesian
problem, the inferential target is the parameter $\lambda$, and $\sigma_\Lambda$ is given as prior information.
The likelihood $L(q \mid \lambda)$ typically involves a nontrivial stochastic structure and is not
deterministic.

By contrast, in the inverse problem we solve, the inferential target is the distribution
$\sigma_\Lambda$, which is not given as the prior. Further, our likelihood $L(q \mid \lambda)$ is given by
a deterministic map, which completely determines the set-valued inverse.

The choice of inverse problem to solve depends completely on the available
information. In the case of a deterministic physics-based model, the unknowns and
quantities subject to uncertainty are the data and parameter values that are input
into the model and the observations that are supposed to match model output, while
the likelihood function determined by the map is completely trivial in a
statistical/probabilistic sense. Based on the law of total probability, the inverse problem
we solve is the direct inverse of the probabilistic forward sensitivity problem for a
deterministic model.

3. Solving the inverse problem. As noted above, while probability densities
describe the random nature of a random variable, the densities themselves are
not random. While a common approach to computing a discrete approximation of a
probability density employs random sampling, this is not necessary. In this paper,
we describe a method for computing approximate probability densities that does not
require random sampling. Our approach breaks the solution down into two stages:

1. Construct an approximate representation of the set-valued inverse solution of
the deterministic model.

2.  Use  measure-theoretic  computational  methods  to  approximate  the  probability 
density  (measure)  structure  on  the  parameter  space  that  corresponds  to  the 
set-valued  inverse  and  the  observed  output  density. 

These  are  independently  interesting  tasks. 

We present a brief overview before providing the details. Under the assumption
of a smooth map, if we are given a fixed output value $q \in \mathcal{D}$, then the implicit
function theorem guarantees the existence of a $(d-1)$-dimensional manifold in $\Lambda$
that is mapped to $q$. Motivation comes from the two-dimensional case, $\lambda = (\lambda_1, \lambda_2)$,
where the manifolds are contours of the surface $q(\lambda_1, \lambda_2)$ (left-hand illustration in
Figure 3.1). Every point in $\Lambda$ lies on a unique contour, so we may consider $\Lambda$ as
a set described by its contours. The set of (generalized) contours is an equivalence
class in the input space, i.e., a quotient space representation of the input space. In
$\Lambda$, there exist 1-dimensional curves transverse to the contours that intersect each
contour once and only once (right-hand illustration in Figure 3.1). We can take one
of these curves as the index for the set of contours. There is a bijection between the
points on an index curve and the points in the range of the output $q(\lambda)$. Therefore,
any measure posed on the range of the output imposes a measure on the index curve.
Thus, the intersection of the contours with the index curve is a random variable with
a distribution uniquely defined by the distribution of the output $\rho_{\mathcal{D}}(q(\lambda))$ (left-hand
illustration in Figure 3.2). In other words, there exists a unique solution to the inverse
sensitivity analysis problem in the set of the contours.

Fig. 3.1. Left: Each observation value corresponds to a unique contour curve. Right: On the
horizontal plane, we show a transverse parameterization. Each point on the transverse
parameterization corresponds to a unique contour curve, so the transverse parameterization acts as an index
for the space of contour curves. There is a unique map from the points in the interval containing
the observed output values to the points on the transverse parameterization.

Fig. 3.2. Left: We show a probability distribution imposed on the output values. A sample of
output values drawn from this distribution corresponds to a unique sample of contour curves. Right
(panel titled "A Sample of Contours"): Plotted is a sample of contour lines in parameter space
corresponding to a specified distribution on the output observation values along with three events.
We specify the Lebesgue measure as the parameter volume measure. Event B has relatively low
probability because while it has relatively large area, the probability of the contours is relatively
low (visible because the density is sparse). Event A has intermediate probability because while the
area of event A is relatively small, A contains contours with relatively high probability (which is
visible because of the dense sample of contours). The probability of event C is largest because it
contains the same high probability contours as A but has larger area.

However, determining the set of contours analytically is infeasible in practice. In
[23], the forward sensitivity analysis problem defined by (2.2), where a given density
$\sigma_\Lambda(\lambda)$ is propagated through the output surface $q(\lambda)$, is solved using a
piecewise-linear tangent plane approximation to the output surface. This requires computations
involving only inner products, which are cheap compared to the full model evaluation
cost of $q(\lambda)$ for each new value of $\lambda$. The derivatives of $q(\lambda)$ are computed implicitly
using adjoint methods. Motivated by this approach, we use a piecewise-linear tangent
plane approximation to the output surface $q(\lambda)$ to construct approximate contours
and an approximate index set.

The next step is to determine the probability density on the parameter set that
corresponds to the distribution on the transverse parameterization of the space of
approximate contours. In order to assign a probability to a measurable set in $\Lambda$, we
first recognize that such a set is defined by the contours it contains and the amount of
each contour it contains (right-hand illustration in Figure 3.2). The parameter volume
measure $\mu_\Lambda$ specified on $\Lambda$ quantifies the amount of each contour contained in any
given set. Combining the results of the generalized contours with such a measure, the
monotone convergence theorem, and additivity properties of measures, we develop an
algorithm to estimate the probability of any measurable set in $\Lambda$. This algorithm
employs a piecewise constant approximation of measures that is commonly used in
measure theory. This yields a direct computational method to approximate $\sigma_\Lambda(\lambda)$.

In the next two sections, we provide details of the two ingredients of the
approximate solution method.

Remark 3.1. Many solution methods for both statistical and deterministic inverse
problems deal with ill-posedness by introducing some form of regularization, either
directly or by reposing the inverse problem as an optimization problem. Such methods
avoid the need to deal with set-valued inverse solutions.
Remark 3.2. There are cases of interest, e.g., a parameter domain that contains a
bifurcation point, for which the described method cannot be used in a straightforward
fashion. We note that while an approach based on random sampling may be applied
nominally to such a problem, the interpretation of the results is still problematic.

Remark 3.3. While the solution method for the inverse problem proposed here
relies on derivatives of a quantity of interest, it is not dependent on how those derivatives
are computed. Instead of an adjoint-based approach, the derivatives might be
computed using (deterministic) forward sensitivity analysis that computes the derivatives
directly along with the solution of the model. Yet another approach, presented, e.g.,
in [27], employs a stochastic spectral method to obtain a polynomial representation
of $q(\lambda)$, which is then used to compute gradients.

3.1. Determining the inverse of the deterministic model using generalized
contours. We consider a finite dimensional map $q$ from the space of parameters
to the output defined implicitly by solving a finite dimensional nonlinear system of
equations,

(3.1)  $f(x; \lambda) = b,$

where $x \in \mathbb{R}^n$, the parameter $\lambda \in \Lambda \subset \mathbb{R}^d$ (assuming that $\Lambda$ is compact) is a random
vector, and $f : \mathbb{R}^n \times \Lambda \to \mathbb{R}^n$ is assumed smooth in both variables. The goal is to
compute a quantity of interest $q(\lambda) = q(x(\lambda)) = (x, \psi)$, described as a linear functional
of the solution $x(\lambda)$. If $x$ depends smoothly on $\lambda$, then the dependence of $q$ on $\lambda$ is
also smooth.

This problem applies in particular to differential equations that depend on a finite
set of parameters. For differential equations, we require the same assumptions as the
standard existence and uniqueness theorems to guarantee the smoothness of $q(\lambda)$.
This is discussed in more detail in the second part of this paper [4].

For any $\bar q \in q(\Lambda)$, we define $\tilde q(\lambda) := q(\lambda) - \bar q$. By assumption, $\tilde q(\lambda) : \mathbb{R}^d \to \mathbb{R}$ is
continuously differentiable and there exists $\lambda \in \Lambda$ such that $q(\lambda) = \bar q$, which implies
that $\tilde q(\lambda) = 0$. We are mainly interested in the case where the quantity of interest
varies as the parameters vary, so we assume that $\partial_{\lambda_d} q(\lambda) \neq 0$, i.e., there is at least one
nontrivial partial derivative. We may relax the restriction $\partial_{\lambda_d} q(\lambda) \neq 0$ for a finite
number of points in $\Lambda$, where $q(\lambda)$ possibly attains a local extreme value, and ignore
this set of points when considering the generalized contours.

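By the implicit function theorem, there exists an open set $U_\lambda \subset \Lambda^{d-1}$, where
$\Lambda^{d-1} := \{\lambda^{d-1} := (\lambda_1, \ldots, \lambda_{d-1}) \mid \lambda = (\lambda_1, \ldots, \lambda_d) \in \Lambda\}$, containing $\lambda^{d-1}$, an open set
$V_\lambda \subset \Lambda_d$, where $\Lambda_d := \{\lambda_d \mid \lambda \in \Lambda\}$, and a differentiable function $g_\lambda : U_\lambda \to V_\lambda$ such
that

(3.2)  $\{(\lambda^{d-1}, g_\lambda(\lambda^{d-1}))\} = \{\lambda \mid q(\lambda) = \bar q\} \cap (U_\lambda \times V_\lambda).$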

Since the implicit function theorem is a local result, there may be additional
points in $\Lambda$ that map to $\bar q$, but are not contained in the set defined by (3.2). Thus,
given $\bar q \in q(\Lambda)$, we choose a collection of sets $\{U_\lambda \times V_\lambda\} = \cup_{\alpha \in I}\{U_{\lambda_\alpha} \times V_{\lambda_\alpha}\}$, where
$\cup_{\alpha \in I}\{\lambda_\alpha\}$ is the set of all $\lambda \in \Lambda$ such that $q(\lambda) = \bar q$. Then using the same notation
as in (3.2), the function $g_\lambda(\lambda^{d-1})$ might be piecewise defined. The set in (3.2) is a
$(d-1)$-dimensional manifold that is a natural inverse of $q(\lambda)$ given $\bar q$. We call this set
the generalized contour.
Theorem 3.1. If we choose distinct $\bar q_1, \bar q_2 \in q(\Lambda)$, then the generalized contours
for $\bar q_1$ and $\bar q_2$ are unique and do not intersect.

Proof. The nonintersection property follows immediately from the fact that $q(\lambda)$ is
a function. Uniqueness follows immediately from the choice $\{U_\lambda \times V_\lambda\} = \cup_{\alpha \in I}\{U_{\lambda_\alpha} \times V_{\lambda_\alpha}\}$,
where $\cup_{\alpha \in I}\{\lambda_\alpha\}$ is the set of all $\lambda \in \Lambda$ such that $q(\lambda) = \bar q$ for a given value of
$\bar q \in q(\Lambda)$. □

In two dimensions, the generalized contours are simply contours of the surface
$q(\lambda_1, \lambda_2)$. We denote a generalized contour for a specific quantity of interest $\bar q$ as
$q^{-1}(\bar q)$. Since $q(\lambda)$ is smooth and $\Lambda$ is compact, $q(\Lambda)$ defines a compact interval of
real numbers, $I_q := [q_m, q_M] = q(\Lambda)$, where $q_m$ and $q_M$ are the absolute minimum
and absolute maximum of $q(\lambda)$, respectively. We redefine $q(\Lambda)$ to be the open interval
$(q_m, q_M)$, which we also denote by $I_q$.

We next prove that there exist (possibly discontinuous) 1-dimensional curves
that are transverse to the generalized contours and can be used to index the family
of generalized contours. We call any curve that has the property that it intersects
each generalized contour once and only once a transverse parameterization (TP).

We give a constructive proof that is a useful algorithm. The algorithm produces
discontinuous curves in $\Lambda$ in general.

Theorem 3.2. Suppose $f$ is smooth in (3.1) and $q(\lambda)$ is a linear functional of the
solution to (3.1). There exists a transverse parameterization for the set of generalized
contours.

Proof. We construct the transverse curve from a finite number of connected
curves. We fix $\epsilon > 0$ and $\epsilon > \delta > 0$, and set $I_{q,\epsilon} = [q_m + \epsilon, q_M - \epsilon]$. If $\Lambda$ is compact,
then the existence of transverse curves is guaranteed by the smoothness of $q(\lambda)$. To
construct a curve, we begin at a point $\gamma_M \in \Lambda$ such that $q(\gamma_M) = q_M - \delta$, and follow
the direction of the negative gradient until the curve either intersects the boundary or
a minimum or saddle is reached, and denote that point $\gamma_m$. From smoothness, exactly
one contour for each value of $q(\lambda)$ between $(q(\gamma_m), q(\gamma_M))$ is intersected by this curve.
If $(q(\gamma_m), q(\gamma_M))$ does not completely cover $I_{q,\epsilon}$, then we select a point $\tau_m \in \Lambda$ such
that $q(\tau_m) = q_m + \delta$, and follow the direction of the gradient until the curve either
intersects the boundary or a maximum or saddle is reached, and denote this point
$\tau_M$. We now check if $(q(\gamma_m), q(\gamma_M)) \cup (q(\tau_m), q(\tau_M))$ covers $I_{q,\epsilon}$. If so, then we
eliminate any part of the second curve that gives an overlap with contours intersected
by the first. Otherwise, we continue to create this curve as above, trying to cover the
output interval defined by $(q(\tau_M), q(\gamma_m))$. This process produces a countable number
of connected curves whose union forms a (possibly discontinuous) transverse curve
through the generalized contours that corresponds to a countable open cover of $I_{q,\epsilon}$,
which is compact. Hence, there is a finite subcover of $I_{q,\epsilon}$, which implies that the
transverse parameterization can be constructed from a finite number of curves. □

In practice, we construct the transverse curve to the generalized contours of $I_q$
by initially following the first two steps above with $\epsilon = 0$, i.e., locate $\gamma_M \in \Lambda$ such
that $q(\gamma_M) = q_M$ and $\tau_m \in \Lambda$ such that $q(\tau_m) = q_m$ and construct the pieces of
the transverse curve by following the negative and positive directions of the gradient,
respectively. If we now take $\epsilon$ to be half the minimum of $q(\gamma_M) - q(\gamma_m)$ and
$q(\tau_M) - q(\tau_m)$, then following the steps above, we construct a curve transverse to all the
contours of $I_q$ in a finite number of steps.
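
The gradient-following step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used for the figures: the map `q` is the first example function from the Examples paragraph of Section 3.1.1 (with the exponent as reconstructed there), central differences stand in for adjoint-based derivatives, and the names `follow_gradient`, `step`, and `max_steps` are hypothetical. The curve starts slightly below the maximum and follows the negative gradient until it leaves $[0,2]^2$ or $q$ stops decreasing.

```python
import math

def q(l1, l2):
    # Example map assumed from the Examples paragraph of Section 3.1.1.
    return l1 * l2 * math.exp(-(l1**2 + 1.25 * l2**2 - 1.0))

def grad_q(l1, l2, h=1e-6):
    # Central differences; an assumed stand-in for adjoint-based derivatives.
    d1 = (q(l1 + h, l2) - q(l1 - h, l2)) / (2.0 * h)
    d2 = (q(l1, l2 + h) - q(l1, l2 - h)) / (2.0 * h)
    return d1, d2

def follow_gradient(start, sign, lo=0.0, hi=2.0, step=1e-3, max_steps=20000):
    """Trace one connected piece of a transverse curve.

    sign=-1 follows the negative gradient, sign=+1 the gradient.
    Stops at the boundary of [lo, hi]^2, at a vanishing gradient, or when q
    stops changing monotonically (a numerical proxy for a critical point).
    """
    l1, l2 = start
    curve = [(l1, l2)]
    for _ in range(max_steps):
        g1, g2 = grad_q(l1, l2)
        norm = math.hypot(g1, g2)
        if norm < 1e-10:
            break
        n1 = l1 + sign * step * g1 / norm
        n2 = l2 + sign * step * g2 / norm
        if not (lo <= n1 <= hi and lo <= n2 <= hi):
            break
        if sign * (q(n1, n2) - q(l1, l2)) <= 0.0:
            break
        l1, l2 = n1, n2
        curve.append((l1, l2))
    return curve

# Start slightly below the maximum, which sits near (0.707, 0.632).
piece = follow_gradient((0.68, 0.61), sign=-1)
values = [q(a, b) for a, b in piece]
```

Because `q` decreases strictly along `piece`, each output value between `values[-1]` and `values[0]` is met exactly once, which is the defining property of a transverse parameterization on that range of outputs.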

3.1.1. Approximating the set of generalized contours. Suppose that $q$ is a
linear function of $\lambda$, i.e., $q(\lambda) = \gamma^\top \lambda$ for some $\gamma \in \mathbb{R}^d$ (recall $\Lambda \subset \mathbb{R}^d$). Then for fixed
$\bar q \in q(\Lambda)$ we have (with the same conventions as above) $U_\lambda$, $V_\lambda$, and $g_\lambda : U_\lambda \to V_\lambda$ such
that $\{(\lambda^{d-1}, g_\lambda(\lambda^{d-1}))\}$ is the generalized contour. In this case, we write the function

$g_\lambda(\lambda^{d-1}) = \big(\bar q - \sum_{i=1}^{d-1} \gamma_i \lambda_i\big) / \gamma_d$

explicitly. The generalized contour above is a
$(d-1)$-dimensional hyperplane, and we refer to this as a generalized linear contour.

We approximate generalized contours locally by generalized linear contours, and
approximate a generalized contour by a generalized piecewise-linear contour. We use
generalized piecewise-linear contours computed from a piecewise-linear tangent plane
approximation to $q(\lambda)$. If $q$ is an affine map of $\lambda$, i.e., $q(\lambda) = \gamma^\top \lambda + q_0$ for some
$q_0 \in \mathbb{R}$, then we use the function above with $\bar q$ replaced by $\bar q - q_0$.
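
As a small worked illustration of the generalized linear contour, the following Python sketch solves $\gamma^\top \lambda + q_0 = \bar q$ for the last coordinate $\lambda_d$ given the first $d-1$ coordinates. The function name `linear_contour_last_coord` and its argument names are hypothetical, and the sketch assumes $\gamma_d \neq 0$, in line with the assumption $\partial_{\lambda_d} q \neq 0$ made above.

```python
def linear_contour_last_coord(gamma, lam_head, q_bar, q0=0.0):
    """Return lambda_d on the contour {lambda : gamma . lambda + q0 = q_bar}.

    gamma    -- coefficients (gamma_1, ..., gamma_d), with gamma_d nonzero
    lam_head -- the first d-1 coordinates (lambda_1, ..., lambda_{d-1})
    q_bar    -- the fixed output value that defines the contour
    q0       -- offset of an affine map (0 for a linear map)
    """
    if len(lam_head) != len(gamma) - 1:
        raise ValueError("lam_head must have one entry fewer than gamma")
    if gamma[-1] == 0.0:
        raise ValueError("gamma_d must be nonzero")
    partial = sum(g * l for g, l in zip(gamma[:-1], lam_head))
    return (q_bar - q0 - partial) / gamma[-1]

gamma = [2.0, -1.0, 4.0]
lam_d = linear_contour_last_coord(gamma, [0.5, 1.5], q_bar=3.0, q0=1.0)
# Check: the completed point lies on the contour.
q_value = 2.0 * 0.5 - 1.0 * 1.5 + 4.0 * lam_d + 1.0
```

In a piecewise-linear approximation, the same formula would be applied cell by cell with `gamma` and `q0` taken from the local tangent plane.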

We obtain derivative information required to compute the tangent plane
approximations implicitly by introducing the adjoint operator. This approach is very useful
when the forward map is complicated to evaluate, e.g., involving the solution of a
differential equation. However, the derivative information can be obtained by any convenient
method.

Local linearization of the linear functional. The goal is to approximate
the map $q(\lambda)$ with a piecewise-linear map $\hat q(\lambda)$ since it is possible to calculate the
generalized contours for this approximate map.

Theorem  3.3.  The  generalized  linear  contours  converge  pointwise  to  the  true 
contours  locally  in  A. 

Proof. Suppose we choose a reference parameter value $\lambda = \mu$ at which to solve

$f(x; \mu) = b$

exactly. Call this reference solution $y$. Then according to Taylor’s theorem,

$f(x; \lambda) = f(y; \mu) + D_x f(y; \mu)(x - y) + D_\lambda f(y; \mu)(\lambda - \mu) + \mathcal{R},$

where $\mathcal{R} = O(\|x - y\|^2 + \|\lambda - \mu\|^2)$ collects the terms of second order. Here $D_x f$ and
$D_\lambda f$ denote the derivatives of $f$ with respect to $x$ and $\lambda$, respectively.

In order to compute the tangent plane approximation efficiently, we use the
generalized Green’s vector $\phi$ that solves the adjoint to the linearized problem

(3.3)  $A^\top \phi = \psi,$

where $A = D_x f(y; \mu)$. Recall that $q(\lambda) = (x, \psi)$, so by substitution of the above and
standard linear algebra we arrive at

$q(\lambda) = q(\mu) - (D_\lambda f(y; \mu)(\lambda - \mu), \phi) - (\mathcal{R}, \phi).$

Neglecting the higher order term leads to an approximation of $q$ by an affine map
$\hat q$. If we denote the generalized contour of $q$ given $\bar q$ by $\{(\lambda^{d-1}, g_\lambda(\lambda^{d-1}))\}$ and the
generalized linear contour of $\hat q$ given $\bar q$ by $\{(\lambda^{d-1}, \hat g_\lambda(\lambda^{d-1}))\}$, then at any $\lambda^{d-1} \in U_\lambda$,

(3.4)  $[g_\lambda(\lambda^{d-1}) - \hat g_\lambda(\lambda^{d-1})]\,[\phi^\top \partial_{\lambda_d} f(y; \mu)] = -(\mathcal{R}, \phi).$

By assumption, $\partial_{\lambda_d} q(\mu) = -\phi^\top \partial_{\lambda_d} f(y; \mu) \neq 0$, so we rewrite (3.4) as

$[g_\lambda(\lambda^{d-1}) - \hat g_\lambda(\lambda^{d-1})] = C\,(\mathcal{R}, \phi),$

where $C^{-1} = -\phi^\top \partial_{\lambda_d} f(y; \mu)$, so that $C$ is a nonzero constant determined entirely by the
reference point $(y, \mu)$. Thus, if we define

$\|U_\lambda\| = \sup_{\lambda \in U_\lambda} \|\lambda - \mu\|_2,$

where $\|\cdot\|_2$ denotes the standard Euclidean norm, then as $\|U_\lambda\| \to 0$, $\|\mathcal{R}\|_2 \to 0$,
which implies that $|g_\lambda(\lambda^{d-1}) - \hat g_\lambda(\lambda^{d-1})| \to 0$. □
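
The adjoint computation of the tangent plane can be checked on a toy problem. The Python sketch below is an illustration only: the $2 \times 2$ system $f(x; \lambda) = A(\lambda)x = b$, the vectors `B` and `PSI`, and all function names are assumptions chosen so that the result can be verified by hand. It solves the adjoint problem (3.3) once at the reference point and forms the gradient as $-\phi^\top D_\lambda f(y; \mu)$, the coefficient of $(\lambda - \mu)$ in the linearization above.

```python
B = [1.0, 2.0]      # right-hand side b (assumed for illustration)
PSI = [1.0, 1.0]    # q(lambda) = (x, psi)

def solve2(a, rhs):
    # Cramer's rule for a 2x2 linear system a z = rhs.
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(rhs[0] * a[1][1] - a[0][1] * rhs[1]) / det,
            (a[0][0] * rhs[1] - a[1][0] * rhs[0]) / det]

def jac_x(lam):
    # D_x f for the toy system f(x; lam) = A(lam) x.
    return [[2.0 + lam[0], 1.0], [1.0, 3.0 + lam[1]]]

def state(lam):
    return solve2(jac_x(lam), B)

def q(lam):
    x = state(lam)
    return x[0] * PSI[0] + x[1] * PSI[1]

def adjoint_gradient(mu):
    y = state(mu)                       # reference solution
    a = jac_x(mu)
    a_t = [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]
    phi = solve2(a_t, PSI)              # adjoint problem A^T phi = psi
    # D_lambda f(y; mu): column j holds d f / d lambda_j = (dA/d lambda_j) y.
    dlam_f = [[y[0], 0.0], [0.0, y[1]]]
    return [-(phi[0] * dlam_f[0][j] + phi[1] * dlam_f[1][j]) for j in range(2)]

def q_linearized(lam, mu):
    g = adjoint_gradient(mu)
    return q(mu) + g[0] * (lam[0] - mu[0]) + g[1] * (lam[1] - mu[1])

mu = [0.5, 1.0]
grad = adjoint_gradient(mu)
h = 1e-6
fd = [(q([mu[0] + h, mu[1]]) - q([mu[0] - h, mu[1]])) / (2 * h),
      (q([mu[0], mu[1] + h]) - q([mu[0], mu[1] - h])) / (2 * h)]
```

For this toy system both gradient components equal $-2/27$ at $\mu = (0.5, 1)$, and the adjoint result agrees with the central differences in `fd`. One adjoint solve yields all partial derivatives at once, which is the reason the approach pays off when the forward map is expensive.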
Global linearization of the linear functional. We extend the local
linearization technique to obtain a global piecewise-linear approximation of the linear
functional over all of $\Lambda$. We first define a partition $\{B_i\}_{i=1}^{M}$ of cells of $\Lambda$. The geometry
is immaterial, as long as we can integrate constant functions over the cells. We apply
the local linearization technique described above for each cell, and defining

$1_{B_i}(\lambda) = 1$ if $\lambda \in B_i$, and $1_{B_i}(\lambda) = 0$ if $\lambda \notin B_i$,

we obtain a global piecewise-linear approximation $\hat q(\lambda)$ to $q(\lambda)$ defined by

(3.5)  $\hat q(\lambda) := \sum_{i=1}^{M} \big(q(\mu_i) + (\nabla q(\mu_i), (\lambda - \mu_i))\big)\, 1_{B_i}(\lambda),$

where $\mu_i$ is the reference parameter value chosen in cell $B_i$.

Theorem 3.4. As $\|B_i\| \to 0$ (or as $M \to \infty$ when the sample points
are distributed uniformly), the generalized linear contour converges pointwise to the
generalized contour.

Proof. For the finite system of nonlinear equations, we have

$\nabla q(\mu_i) = -\phi_i^\top D_\lambda f(y_i; \mu_i),$

where $\phi_i$ solves the linearized adjoint problem using the reference point $(y_i, \mu_i)$. If
we let $-(\mathcal{R}_i, \phi_i)$ denote the higher-order terms neglected in the linearization of $q(\lambda)$
in cell $B_i$, then we can write the error of the piecewise-linear approximation,
$e(\lambda) = q(\lambda) - \hat q(\lambda)$, as

$e(\lambda) = -\sum_{i=1}^{M} (\mathcal{R}_i, \phi_i)\, 1_{B_i}(\lambda).$

The generalized linear contour of $\hat q$ given $\bar q$ is a collection of hyperplanes in $\Lambda$. Using
the same notation as above,

$|g_\lambda(\lambda^{d-1}) - \hat g_\lambda(\lambda^{d-1})| \le C \sum_{i=1}^{M} |(\mathcal{R}_i, \phi_i)|, \qquad C^{-1} = \min_i \{|\phi_i^\top \partial_{\lambda_d} f(y_i; \mu_i)|\}.$

This yields the convergence result. □

The transverse parameterization (TP) for the generalized linear contours is
constructed using $\hat q$ in the same way as described in the proof of Theorem 3.2. Since $\hat q$ is
a piecewise-linear surface, the resulting TP is a piecewise-linear curve in $\Lambda$.

Examples. We illustrate the convergence of generalized linear contours to true
contours in the two examples below.

In the first example, we suppose that $q(\lambda_1, \lambda_2) = \lambda_1 \lambda_2 \exp[-(\lambda_1^2 + 1.25\lambda_2^2 - 1)]$
over $[0,2] \times [0,2]$. We approximate $q$ over a uniform partition $\{B_i\}$ of $[0,2] \times [0,2]$
into squares, and we linearize around the midpoint of each $B_i$ to form $\hat q$ in (3.5). We
plot various contour curves and two TPs on each plot. The results are summarized
in Figure 3.3.
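
The convergence of the piecewise-linear tangent plane approximation can be observed numerically. The Python sketch below is a hedged illustration, not the code behind Figure 3.3: it assumes the first example function with the exponent as written above, uses the analytic gradient in place of adjoint-based derivatives, and measures the maximum error of $\hat q$ on a fixed sample grid for the four partitions used in the figure. The names `q_hat` and `max_error` are hypothetical.

```python
import math

def q(l1, l2):
    return l1 * l2 * math.exp(-(l1**2 + 1.25 * l2**2 - 1.0))

def grad_q(l1, l2):
    # Analytic gradient of the example map (stands in for adjoint derivatives).
    e = math.exp(-(l1**2 + 1.25 * l2**2 - 1.0))
    return l2 * e * (1.0 - 2.0 * l1**2), l1 * e * (1.0 - 2.5 * l2**2)

def q_hat(l1, l2, n, lo=0.0, hi=2.0):
    """Tangent plane approximation on an n x n partition, as in (3.5),
    linearized about the midpoint of the cell that contains (l1, l2)."""
    h = (hi - lo) / n
    i = min(int((l1 - lo) / h), n - 1)
    j = min(int((l2 - lo) / h), n - 1)
    m1 = lo + (i + 0.5) * h
    m2 = lo + (j + 0.5) * h
    g1, g2 = grad_q(m1, m2)
    return q(m1, m2) + g1 * (l1 - m1) + g2 * (l2 - m2)

def max_error(n, samples=101):
    pts = [2.0 * k / (samples - 1) for k in range(samples)]
    return max(abs(q(a, b) - q_hat(a, b, n)) for a in pts for b in pts)

errors = {n: max_error(n) for n in (5, 10, 25, 50)}
```

The sampled error shrinks as the cells are refined, roughly in proportion to the square of the cell width, which is the behavior expected from the neglected second-order terms.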

For a second example, we suppose $q(\lambda_1, \lambda_2) = \exp[\cos(\lambda_1) + \sin(\lambda_2)]$ on
$[-2\pi - 0.1, 2\pi + 0.1]^2$. We proceed as above to obtain the numerical results summarized in
Figure 3.4.
Fig. 3.3. Contours of $\hat q$ using 5 × 5 cells (top left), 10 × 10 cells (top right), 25 × 25 cells (bottom
left), and 50 × 50 cells (bottom right). The TP is created using the algorithm outlined in the proof
of its existence and is denoted by the circle-dotted and plus-dotted lines. The circle-dotted line is
constructed from the maximum of $q(\lambda)$ and follows the negative direction of the gradient of $q(\lambda)$,
and the plus-dotted line is constructed from the minimum of $q(\lambda)$ and follows the direction of the
gradient.


3.2. Computing the parameter probability density. We now explain how
to use the unique solution to the inverse problem in the space of generalized contours to
compute an approximation of the probability density $\sigma_\Lambda$ on $\Lambda$. We first observe that if
$I = [q_1, q_2] \subset I_q$ is an event with probability $P(I) = P(q(\lambda) \in I)$, then this corresponds
to a measurable set in $\Lambda$ that is defined as the set of all contours obtained by $q^{-1}(I)$.
From the basic assumptions of smoothness and the nonintersecting property of the
contours, the set of all contours is a set in $\Lambda$ that is contained between the two
contours defined by $q^{-1}(q_1)$ and $q^{-1}(q_2)$ (or possibly one of these contours and the
boundary of $\Lambda$). We assign this set the probability $P(I)$. It follows immediately that
we can define the inverse into the set of generalized contours for a given distribution
of $q(\lambda)$ uniquely.

Theorem 3.5. Suppose $f$ is smooth in (3.1) and $q(\lambda)$ is a linear functional of
the solution to (3.1). If $q(\lambda)$ is a random variable with distribution $F_q(q(\lambda))$, then for
a fixed TP in $\Lambda$, the distribution of the intersections of the generalized contours on
the TP, which is a random variable, is unique.

The probability of a measurable set in $\Lambda$ is determined by the contours the set
contains, the amount of each contour the set contains, and the probabilities of
those contours. The parameter volume measure $\mu_\Lambda$ determines the contours a given
set contains and the amount of each contour the set contains.

3.2.1. Computational measure theory. The method we develop for computing
an approximate probability distribution is based on constructions used in measure
theory.
Fig. 3.4. Contours of $\hat q$ using 7 cells (top left), 10 × 10 cells (top right), 25 × 25 cells (bottom
left), and 50 × 50 cells (bottom right). The TP is created using the algorithm outlined in the proof
of its existence and is denoted by the square-dotted and circle-dotted lines. The square-dotted line
is constructed from the maximum of $q(\lambda)$ and follows the negative direction of the gradient of $q(\lambda)$,
and the circle-dotted line is constructed from the minimum of $q(\lambda)$ and follows the direction of the
gradient.


Theorem 3.6. Given a measurable set $A \subset \Lambda$, we can approximate $P(A)$ using
a simple function approximation to $\rho_{\mathcal{D}}$, which requires only calculations of volumes
in $\Lambda$.

The constructive proof below parallels Algorithm 1 for approximating the
probability of a measurable set $A \subset \Lambda$.

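Proof. For $\lambda$ restricted between any two contours induced by a subinterval of a
partition of $\mathcal{D}$ as in Algorithm 1, $q(\lambda)$ is approximately a uniformly distributed random
variable. Suppose that $\{q_j\}_{j=0}^{N}$ is a partition of $\mathcal{D}$ such that $q_0 < q_1 < \cdots < q_N$, and
if $E_j = [q_{j-1}, q_j)$, then $\mathcal{D} = \cup_j E_j$. Let $A_j = \{\lambda \mid q(\lambda) \in E_j\}$. We assume that
$\Lambda = \cup_j A_j$. The probability of $A_j$ is given by

$P(A_j) = \int_{A_j} \sigma_\Lambda(\lambda)\, d\mu_\Lambda(\lambda).$

We can compute this probability because of the 1-1 correspondence between the
contours and output values, i.e., $P(A_j) = P(E_j) = \int_{E_j} \rho_{\mathcal{D}}(q)\, d\mu_{\mathcal{D}}(q)$. Therefore, we have
a simple function approximation to $\sigma_\Lambda(\lambda)$ given by

$\sigma_\Lambda(\lambda) \approx \sigma_{\Lambda,N}(\lambda) = \sum_{j=1}^{N} \frac{P(A_j)}{\mu_\Lambda(A_j)}\, 1_{A_j}(\lambda).$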
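Algorithm 1. Approximate Parameter Probability Distribution Method
Fix a simple function approximation $\rho_{\mathcal{D}}^{(M)}(q)$ to $\rho_{\mathcal{D}}(q)$ that induces a partition
$\{[q_{i-1}, q_i)\}_{i=1}^{N(M)}$ of $\mathcal{D}$, where for each $i = 1, \ldots, N(M)$, $\rho_{\mathcal{D}}^{(M)}(q)$ is constant on each
subinterval $[q_{i-1}, q_i)$
$\{[q_{i-1}, q_i)\}$ induces a partition of $\Lambda$ by generalized contours and $\{A_j\}_{j=1}^{N(M)}$
denotes this partition
Let $P_j$ denote the probability of $A_j$ given by $\int_{q_{j-1}}^{q_j} \rho_{\mathcal{D}}^{(M)}(q)\, d\mu_{\mathcal{D}}(q)$
Partition $\Lambda$ with small cells $\{b_i\}_{i=1}^{M'}$
for $i = 1, \ldots, M'$ do
    for $j = 1, \ldots, N(M)$ do
        Calculate ratio of volume of $b_i \cap A_j$ to volume of $A_j$, store in matrix $V_{ij}$
    end for
    Set $P(b_i)$ equal to $\sum_{j=1}^{N(M)} V_{ij} P_j$
end for
Given event $A \subset \Lambda$, estimate $P(A)$ using

•  inner sums, i.e., sum of $P(b_i)$ for all $i \in I \subset \{1, \ldots, M'\}$ such that $b_i \subset A$,

•  outer sums, i.e., sum of $P(b_i)$ for all $i \in I \subset \{1, \ldots, M'\}$ such that $b_i \cap A \neq \emptyset$,

•  average of inner and outer sums, or

•  $\int_A \sigma_{\Lambda,M'}(\lambda)\, d\mu_\Lambda(\lambda)$, where $\sigma_{\Lambda,M'}(\lambda) = \sum_{i=1}^{M'} \frac{P(b_i)}{\mu_\Lambda(b_i)}\, 1_{b_i}(\lambda)$.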


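Given event $A \subset \Lambda$, we use the law of total probability to write

$P(A) = \sum_{j=1}^{N} P(A \mid A_j)\, P(A_j).$

Using the above simple function approximation to the parameter density, we have

$P(A \mid A_j) = \frac{P(A \cap A_j)}{P(A_j)} \approx \frac{\int_{A \cap A_j} d\mu_\Lambda(\lambda)}{\int_{A_j} d\mu_\Lambda(\lambda)} = \frac{\mu_\Lambda(A \cap A_j)}{\mu_\Lambda(A_j)}.$

Hence, the probability $P(\lambda \in A \mid q(\lambda) \in E_j) = P(A \mid A_j)$ can be calculated from the
volume measure on model space since it depends only on measurable sets in $\Lambda$ if we
use the approximation $q(\lambda) \sim U(E_j)$ for $\lambda \in A_j$. The value is the ratio of the volume
of $A \cap A_j$ to the volume of $A_j$. Since the density on data space is a nonnegative
integrable function, there exists a sequence of simple functions $\{\rho_{\mathcal{D}}^{(M)}(q)\}_{M=1}^{\infty}$ with

$\rho_{\mathcal{D}}^{(M)}(q) = \sum_{k=1}^{2^{2M}} \frac{k-1}{2^M}\, 1_{E_{M,k}}(q),$

where $E_{M,k} = \rho_{\mathcal{D}}^{-1}(I_{M,k})$ and $I_{M,k} = [(k-1)/2^M, k/2^M]$. We first observe that the partition $\{I_{M,k}\}$ induces a
partition $\{E_{M,k}\}$ of $\mathcal{D}$. Also, we observe that $\rho_{\mathcal{D}}^{(M)}(q) \to \rho_{\mathcal{D}}(q)$ in $L^1$ as $M \to \infty$ by
the monotone convergence theorem, and for any measurable set $E \subset \mathcal{D}$,

$\int_E \rho_{\mathcal{D}}^{(M)}(q)\, d\mu_{\mathcal{D}}(q) = \sum_{k=1}^{2^{2M}} \frac{k-1}{2^M}\, \mu_{\mathcal{D}}(E_{M,k} \cap E) \to P_{\mathcal{D}}(E)$ as $M \to \infty$.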
Thus, we can approximate the value of $P(A \mid A_j)$ by the ratio of the volume of $A \cap A_j$ to
the volume of $A_j$ obtained from the volume measure on model space if the induced
partitions $\{A_j\}$ come from a sufficiently fine partition $\{E_j\}$ of data space so that the
distribution of $q(\lambda)$ for $\lambda \in A_j$ is approximated by $U(E_j)$.

Since $P(A) = \sup\{P(K) : K \subset A,\ K \text{ compact}\}$ and $P(A) = \inf\{P(U) : A \subset U,\ U \text{ open}\}$,
we can estimate $P(A)$ using the inner and outer sums described by
Algorithm 1. □
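
The steps of Algorithm 1 can be sketched in Python for a two-dimensional parameter space. This is a minimal illustration under loudly stated assumptions, not the authors' implementation: the map `q` is the first example of Section 3.1.1, the output density is taken to be a normal density (mean 0.3, standard deviation 0.05, both hypothetical) restricted to the sampled output range, the volume measure is Lebesgue measure, and each small cell $b_i$ is assigned wholly to the set $A_j$ that contains its midpoint, so the volume ratios $V_{ij}$ are only approximated. All function and variable names are hypothetical.

```python
import math

def q(l1, l2):
    return l1 * l2 * math.exp(-(l1**2 + 1.25 * l2**2 - 1.0))

def normal_cdf(x, mean, std):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def cell_probabilities(n_cells=100, n_bins=20, mean=0.3, std=0.05,
                       lo=0.0, hi=2.0):
    h = (hi - lo) / n_cells
    mids = [lo + (k + 0.5) * h for k in range(n_cells)]
    qvals = [[q(a, b) for b in mids] for a in mids]
    q_lo = min(min(row) for row in qvals)
    q_hi = max(max(row) for row in qvals)
    width = (q_hi - q_lo) / n_bins
    edges = [q_lo + j * width for j in range(n_bins + 1)]
    # P_j: probability of each output subinterval E_j (normalized on [q_lo, q_hi]).
    total = normal_cdf(edges[-1], mean, std) - normal_cdf(edges[0], mean, std)
    p_bin = [(normal_cdf(edges[j + 1], mean, std)
              - normal_cdf(edges[j], mean, std)) / total for j in range(n_bins)]
    # Assign each cell b_i to the set A_j containing its midpoint (approximation).
    bin_of = [[min(int((v - q_lo) / width), n_bins - 1) for v in row]
              for row in qvals]
    vol_bin = [0.0] * n_bins
    for row in bin_of:
        for j in row:
            vol_bin[j] += h * h
    # P(b_i) = P_j * vol(b_i) / vol(A_j) for the single j that b_i is assigned to.
    p_cell = [[p_bin[j] * (h * h) / vol_bin[j] for j in row] for row in bin_of]
    return mids, h, p_cell

def event_bounds(mids, h, p_cell, box, eps=1e-12):
    """Inner and outer sums for a rectangular event box = ((a_lo, a_hi), (b_lo, b_hi))."""
    (a_lo, a_hi), (b_lo, b_hi) = box
    inner = outer = 0.0
    for i, a in enumerate(mids):
        for k, b in enumerate(mids):
            a0, a1, b0, b1 = a - h / 2, a + h / 2, b - h / 2, b + h / 2
            if (a0 >= a_lo - eps and a1 <= a_hi + eps
                    and b0 >= b_lo - eps and b1 <= b_hi + eps):
                inner += p_cell[i][k]
            if (a1 > a_lo + eps and a0 < a_hi - eps
                    and b1 > b_lo + eps and b0 < b_hi - eps):
                outer += p_cell[i][k]
    return inner, outer

mids, h, p_cell = cell_probabilities()
total_mass = sum(sum(row) for row in p_cell)
inner, outer = event_bounds(mids, h, p_cell, ((0.5, 1.0), (0.4, 0.9)))
estimate = 0.5 * (inner + outer)
```

The cell probabilities are computed once and can then be reused for any event, which is the point of introducing the cells $\{b_i\}$; only `event_bounds` needs to be rerun for a new event.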

Remark  3.4.  If  the  set  A  has  not  (yet)  been  specified,  we  may  still  carry  out  the 
first  part  of  Algorithm  1  to  obtain  a  discretized  approximation  of  the  measure  P  on 
model  space. 

Remark 3.5. The set of cells $\{b_i\}_{i=1}^{M'}$ in Algorithm 1 is introduced purely for
computational purposes and is not necessary to the approximation of $P(A)$. We
choose $\{b_i\}_{i=1}^{M'}$ in order to approximate $P(A)$, for any event $A \subset \Lambda$, without carrying
out the calculations in the nested loops of Algorithm 1 for each new event. If we are
interested only in one event, $A \subset \Lambda$, then we might skip the step of partitioning $\Lambda$
by $\{b_i\}_{i=1}^{M'}$ and replace the step in the nested loop by the following: Calculate ratio
of volume of $A \cap A_j$ to volume of $A_j$, store in vector $V_j$. We may then approximate
$P(A)$ by $\sum_j V_j P_j$.

Remark 3.6. Note that as we refine the partition $\{E_j\}$ on the data space, which
in turn refines the partition $\{A_j\}$ on model space, we should consider refining the
mesh that defines the partition $\{b_i\}$ on model space. The reason is that we assign a
probability $P(b_i)$ to each cell $b_i$ that in essence reapproximates the simple function
approximation

$\sigma_\Lambda(\lambda) \approx \sigma_{\Lambda,N}(\lambda) = \sum_{j=1}^{N} \frac{P(A_j)}{\mu_\Lambda(A_j)}\, 1_{A_j}(\lambda)$

by the new simple function

$\sigma_\Lambda(\lambda) \approx \sigma_{\Lambda,M'}(\lambda) = \sum_{i=1}^{M'} \frac{P(b_i)}{\mu_\Lambda(b_i)}\, 1_{b_i}(\lambda).$

If the partition $\{b_i\}$ remains fixed as the approximation of $\rho_{\mathcal{D}}(q)$ by simple functions
is refined by the partition $\{E_j\}$, then the representation of $\sigma_\Lambda(\lambda)$ as a simple function
converges with respect to the fixed $\{b_i\}$. When choosing $\{b_i\}$, we should consider that
a cell $b_i$ might be large relative to the $A_j$ that it intersects, i.e., $b_i$ might intersect many
$A_j$. When this is the case, estimating the probability over $b_i$ by a constant $P(b_i)$ might
not be an appropriate approximation. In general, it is not computationally demanding
to estimate an appropriate size of the $b_i$.

Observations on simple function approximations. The use of simple
function approximations of a probability density is sufficiently unusual in the context of
stochastic analysis of differential equations as to justify comment. Simple function
approximations form the basis for classic measure theory because they yield several
benefits, including

•  Simple function approximations are widely applicable under minimal
assumptions on the density being approximated. As the examples below suggest,
probability densities solving inverse problems appear to be highly complex in
general.

•  The convergence analysis for simple function approximations is also widely
applicable. This contrasts with sampling techniques such as Markov chain
Monte Carlo methods whose convergence properties are stochastic and can
be highly sensitive to properties of the problem.

Though we have not exploited the fact in this paper, simple function approximations
also offer significant benefits for stochastic sensitivity analysis of differential equations
[12, 13, 9, 10, 7]. In particular, combining a simple function approximation with
sensitivity derivatives of a quantity of interest with respect to parameters provides
both a natural dimension reduction mechanism and the basis for adaptive sampling.

Of  course,  a  significant  issue  with  simple  function  approximations  is  the  nominal 
dependence  of  accuracy  on  the  dimension  of  the  parameter  space.  This  may  be  a 
consequence  of  the  common  approach  of  using  hyper-rectangular  cell  discretizations  of 
the  underlying  space  combined  with  the  unfortunate  growth  in  diagonal  dimension  of 
hyper-rectangles  as  dimension  increases,  though  we  report  on  some  inconclusive  results 
of  using  radial  basis  functions  in  [12].  In  our  experience,  the  effects  of  dimension  are 
nominal  up  to  dimensions  of  8-10,  and  we  have  effectively  used  the  piecewise  constant 
approximations to dimensions of order 15-18. We note that this is the effective dimension.
By  exploiting  dimension  reduction,  the  nominal  dimension  of  the  parameter  space  may 
be  higher. 

4. Examples. We apply the new method to solve inverse problems associated
with a variety of maps. We first consider three constrained geometric optimization
problems. We then discuss examples involving a nonlinear ordinary differential
equation and a nonlinear elliptic partial differential equation with two parameters. Finally,
we discuss the determination of regions with high probability.

In  the  following  examples,  we  have  chosen  the  uniform  Lebesgue  measure  for  the 
parameter  volume  measure  and  often  impose  a  normal  distribution  on  the  output 
quantity  of  interest.  The  first  choice  is  made  because  it  is  commonly  the  (implicit) 
default,  e.g.,  in  Bayesian  inference.  The  imposition  of  a  normal  distribution  on  the 
output  is  also  a  common  choice.  In  our  examples,  it  serves  the  purpose  of  illustrating 
the  complex  nature  of  the  inverse  probability  measure  that  results  even  when  a  normal 
distribution has been imposed on the output. However, we emphasize that neither of
these choices is important in terms of implementing the numerical solution method,
which is readily applied for any distributions.

4.1. A 2-dimensional nonlinear function. We consider the map determined
implicitly as the solution of the finite-dimensional nonlinear system of equations given
by

\[
\begin{aligned}
\lambda_1 x_1^2 + x_2^2 &= 1, \\
x_1^2 - \lambda_2 x_2^2 &= 1,
\end{aligned}
\]

where $\lambda_1$ and $\lambda_2$ are the parameters. Geometrically, solutions $x = (x_1, x_2)^\top$ to the
system represent intersections of the hyperbola and ellipse. The quantity of interest
is the second component of the solution in the first quadrant, i.e.,
$q(\lambda) = q(x(\lambda)) = x_2 = \langle x, \psi \rangle$, where $\psi = (0, 1)^\top$. According to (3.3), the adjoint problem is

\[
\begin{pmatrix} 2\mu_1 y_1 & 2 y_1 \\ 2 y_2 & -2\mu_2 y_2 \end{pmatrix} \phi = \psi,
\]

where $\mu = (\mu_1, \mu_2)^\top$ and $y = (y_1, y_2)^\top$ are the reference parameter and reference
solution for the forward problem.
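
The forward and adjoint solves for this example can be sketched in a few lines. The following is a minimal illustration, assuming the system reads $\lambda_1 x_1^2 + x_2^2 = 1$, $x_1^2 - \lambda_2 x_2^2 = 1$ and that the adjoint matrix is the transpose of the state Jacobian. The function names, the Newton starting guess, and the sample parameter value are hypothetical choices made for the sketch; the gradient formula follows from implicit differentiation of the system and is not quoted from (3.3) or (3.5).

```python
import numpy as np

def residual(x, lam):
    """F(x; lam) = 0 encodes the ellipse and the hyperbola."""
    x1, x2 = x
    l1, l2 = lam
    return np.array([l1 * x1**2 + x2**2 - 1.0,
                     x1**2 - l2 * x2**2 - 1.0])

def jacobian(x, lam):
    """Jacobian of F with respect to the state x."""
    x1, x2 = x
    l1, l2 = lam
    return np.array([[2.0 * l1 * x1, 2.0 * x2],
                     [2.0 * x1, -2.0 * l2 * x2]])

def solve_forward(lam, x0=(1.0, 0.5), tol=1e-13, max_iter=50):
    """Newton iteration started in the first quadrant (assumed starting guess)."""
    x = np.array(x0, dtype=float)
    for _ in range(max_iter):
        dx = np.linalg.solve(jacobian(x, lam), -residual(x, lam))
        x = x + dx
        if np.linalg.norm(dx) < tol:
            break
    return x

def adjoint_gradient(lam):
    """Return q = x_2 and its gradient in lam using a single adjoint solve."""
    y = solve_forward(lam)
    psi = np.array([0.0, 1.0])
    phi = np.linalg.solve(jacobian(y, lam).T, psi)   # J^T phi = psi
    # dF/dlam = diag(y1^2, -y2^2), hence grad q = -(dF/dlam)^T phi
    grad = np.array([-y[0]**2 * phi[0], y[1]**2 * phi[1]])
    return y[1], grad

q_ref, grad_ref = adjoint_gradient((0.89, 1.0))
```

One adjoint solve yields both components of the gradient, which is the property that makes the piecewise-linear approximation inexpensive to build when the number of parameters grows.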


Fig. 4.2. Illustration of an application of Algorithm 1. Left: We determine which contours are
contained in an event $A \subset \Lambda$ and how much of each contour is inside the event. Right: We estimate
the probabilities of small cells contained in the event and use an inner and outer estimate to obtain
an approximation of the probability of the event $A$.

In order to create an interesting example, we choose
$\Lambda = [.79, .99] \times [1 - 4.5\sqrt{0.1}, 1 + 4.5\sqrt{0.1}]$ based on a sensitivity analysis of the forward problem in [23]. We use six
uniformly spaced mesh points in both the $\lambda_1$ and $\lambda_2$ directions of $\Lambda$ to create cells
$\{B_i\}$ that partition $\Lambda$. We use the centroid of each cell as the reference parameter
$\mu_i = (\mu_{1,i}, \mu_{2,i})^\top$ in that cell and solve the forward problem to obtain reference
solutions $y_i = (y_{1,i}, y_{2,i})^\top$ at these points, and then solve for the generalized Green's
vector $\phi_i = (\phi_{1,i}, \phi_{2,i})^\top$ at the reference point $(\mu_i, y_i)$. According to (3.5), we obtain
a global piecewise-linear approximation $\hat q$ to $q$ defined as

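\[
\hat q(\lambda) := \sum_i \left( y_{2,i} + (\lambda - \mu_i)^\top \begin{pmatrix} -y_{1,i}^2\, \phi_{1,i} \\ y_{2,i}^2\, \phi_{2,i} \end{pmatrix} \right) \mathbf{1}_{B_i}(\lambda).
\]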
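
A global piecewise-linear approximation of this kind can be assembled from cell centroids, reference values, and gradients. The sketch below is illustrative only: `build_piecewise_linear` is a hypothetical helper, the closed-form value and gradient of $x_2$ stand in for the forward and adjoint solves, and the parameter box and the five cells per direction are arbitrary demonstration choices rather than the settings used in the text.

```python
import numpy as np

def build_piecewise_linear(q, grad_q, lo, hi, n_cells):
    """Tabulate q and its gradient at cell centroids; return an evaluator for q_hat."""
    edges = [np.linspace(a, b, n + 1) for a, b, n in zip(lo, hi, n_cells)]
    centers = [0.5 * (e[:-1] + e[1:]) for e in edges]
    values, grads = {}, {}
    for idx in np.ndindex(*n_cells):
        mu = np.array([c[i] for c, i in zip(centers, idx)])
        values[idx] = q(mu)
        grads[idx] = np.asarray(grad_q(mu), dtype=float)

    def q_hat(lam):
        lam = np.asarray(lam, dtype=float)
        # Locate the cell containing lam; boundary points are clamped to a valid cell.
        idx = tuple(
            int(max(0, min(np.searchsorted(e, x, side="right") - 1, len(e) - 2)))
            for e, x in zip(edges, lam)
        )
        mu = np.array([c[i] for c, i in zip(centers, idx)])
        return values[idx] + grads[idx] @ (lam - mu)

    return q_hat

def q_exact(lam):
    """Closed form of x_2 in the first quadrant for the two-equation system."""
    return np.sqrt((1.0 - lam[0]) / (1.0 + lam[0] * lam[1]))

def grad_exact(lam):
    """Analytic gradient, used here in place of an adjoint solve."""
    g = (1.0 - lam[0]) / (1.0 + lam[0] * lam[1])
    d = (1.0 + lam[0] * lam[1]) ** 2
    dg = np.array([-(1.0 + lam[1]) / d, -(1.0 - lam[0]) * lam[0] / d])
    return dg / (2.0 * np.sqrt(g))

q_hat = build_piecewise_linear(q_exact, grad_exact, (0.79, 0.5), (0.99, 1.5), (5, 5))
```

The approximation is exact at each centroid and discontinuous across cell boundaries, which is acceptable here because it is only used to approximate contours and cell probabilities.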

We assume that the output data is a random variable with normal distribution on
the data space defined by $\hat q(\Lambda)$ (Figure 4.1). We assume $\mu_\Lambda$ is the Lebesgue measure.
We implement Algorithm 1 to calculate $P(b_i)$ for small cells for each fine partition of
$\Lambda$ and determine the probabilities of events $A \subset \Lambda$. We plot the results in Figure 4.2.
Fig. 4.3. We use 15 x 15 x 15 small cells in Algorithm 1. We plot the approximate distribution
from several angles. Left: A 3-dimensional view. Right: The same 3-dimensional view rotated 90
degrees clockwise.

Fig. 4.4. We use 15 x 15 x 15 small cells in Algorithm 1. We plot the approximate distribution
from several angles. Left: The original 3-dimensional view rotated 180 degrees clockwise. Right:
The original 3-dimensional view rotated 270 degrees clockwise.


4.1.1.  A  three-parameter  geometric  constrained  optimization  problem. 
The map to be inverted is determined by minimizing the distance to the point
$(1, -1, 1)$ among points constrained to lie on the surface $g = 4$, where

\[
g(x_1, x_2, x_3; \lambda_1, \lambda_2, \lambda_3) = \lambda_1 x_1^2 + \lambda_2 x_2^2 + \lambda_3 x_3^2.
\]

Geometrically, the parameters determine the shape of the ellipsoid that defines the
constraint. Using the method of Lagrange multipliers we set up a system of nonlinear
equations with four state variables and three parameters. We take the quantity of
interest as the first state variable, which geometrically is interpreted as the first spatial
coordinate in the solution to the constrained minimization problem. We set
$\Lambda = [.35, .65] \times [.28, .52] \times [.42, .78]$ and construct a piecewise-linear approximation using
125 points in $\Lambda$. We assume a normal distribution on $q(\Lambda)$ and take the underlying
parameter volume measure $\mu_\Lambda$ to be a normalized Lebesgue measure. We use 3375
small cells $\{b_i\}$ in Algorithm 1. We plot the probabilities at the midpoint of each
cell with the color of the point determined by the probability of the small cell in
Figures 4.3-4.4.
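
The forward map for this example can be evaluated with a short routine. The sketch below is a hedged illustration, not the solver used for the reported results: instead of applying Newton's method to the four-variable Lagrange system, it eliminates the state through the stationarity condition $x_i = p_i / (1 + \nu \lambda_i)$ and finds the multiplier $\nu$ by bisection. That reduction assumes the target point $p = (1, -1, 1)$ lies inside the ellipsoid, which holds at the sample parameter value used here. The function name and the sample value are hypothetical.

```python
import numpy as np

def nearest_point_on_ellipsoid(lam, p=(1.0, -1.0, 1.0), level=4.0, iters=200):
    """Closest point to p on sum(lam_i * x_i^2) = level, with its Lagrange multiplier."""
    lam = np.asarray(lam, dtype=float)
    p = np.asarray(p, dtype=float)

    def constraint_gap(nu):
        # Constraint residual after substituting x_i = p_i / (1 + nu * lam_i).
        return np.sum(lam * p**2 / (1.0 + nu * lam) ** 2) - level

    if constraint_gap(0.0) >= 0.0:
        raise ValueError("this sketch assumes p lies strictly inside the ellipsoid")
    # The gap decreases monotonically on (-1/max(lam), 0): +inf at the left end, < 0 at 0.
    lo, hi = -1.0 / lam.max(), 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if constraint_gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    nu = 0.5 * (lo + hi)
    return p / (1.0 + nu * lam), nu

# Quantity of interest: first spatial coordinate of the constrained minimizer.
x_star, nu_star = nearest_point_on_ellipsoid((0.5, 0.4, 0.6))
q_value = x_star[0]
```

Because all factors $1 + \nu \lambda_i$ are positive at the computed root, the Hessian of the Lagrangian is positive definite and the stationary point is the constrained minimizer.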
Fig. 4.5. We use 15 x 15 x 15 x 18 small cells in Algorithm 1. We plot “snapshots” of
the  approximate  probability  distribution  for  three  values  of  the  fourth  parameter.  Left:  The  fourth 
parameter  is  set  at  its  minimum  value.  Middle:  The  fourth  parameter  is  set  at  its  midpoint  value. 
Right:  The  fourth  parameter  is  set  at  its  maximum  value.  Notice  how  the  probabilities  vary  in  space 
as  we  vary  the  fourth  parameter. 


4.1.2.  A  four-parameter  geometric  constrained  optimization  problem. 
The map to be inverted is determined by minimizing the distance to the point $(5, 5, 5)$
among points constrained to lie on the intersection of the surfaces $g = 1$ and $h = 0$,
where

\[
\begin{aligned}
g(x_1, x_2, x_3; \lambda_1, \lambda_2) &= \lambda_1 x_1^2 + \lambda_2 x_2^2 - x_3^2, \\
h(x_1, x_2, x_3; \lambda_3, \lambda_4) &= \lambda_3 x_1 + \lambda_4 x_2 - x_3.
\end{aligned}
\]

Geometrically, $g = 1$ defines a hyperboloid of one sheet and $h = 0$ defines a plane
through the origin, and the intersection of the two constraints is a closed curve. Using
the method of Lagrange multipliers we set up a system of nonlinear equations with
five state variables and four parameters. We take the quantity of interest as the first
state variable, which geometrically is interpreted as the first spatial coordinate in the
solution to the constrained minimization problem. We set
$\Lambda = [1.4, 2.6] \times [.7, 1.3] \times [1.4, 2.6] \times [.35, .65]$ and construct a piecewise-linear approximation using 750 points
in $\Lambda$. We assume a normal distribution on $q(\Lambda)$ and take $\mu_\Lambda$ to be a normalized
Lebesgue measure. We use 60750 small cells $\{b_i\}$ in Algorithm 1. Displaying a
4-dimensional distribution is problematic. We plot “snapshots” of the approximated
probability density for three fixed $\lambda_4$ values in Figure 4.5.

4.1.3.  A  two-parameter  ordinary  differential  equation.  We  now  study  the 
nonlinear  ordinary  differential  equation 

\[
\begin{cases}
\dot x = \lambda_1 \sin(\lambda_2 x), & 0 < t \leq T, \\
x(0) = 1.
\end{cases}
\]

The linear functionals (quantities of interest, $q(\lambda)$) we study take the form

\[
q(\lambda) = \langle x(t), \psi(t) \rangle = \int_0^T \langle x(s, \lambda), \psi(s) \rangle \, ds,
\]

and we take the quantity of interest to be the average value of $x(t)$ over the time
interval $[0, 2]$. Thus, we set $\psi(t) = \mathbf{1}_{[0,2]}(t)/2$, and the generalized Green's function
$\phi(t)$ solves the adjoint problem,

\[ -\dot\phi(t) - A^\top(t)\, \phi(t) = \psi(t), \qquad T > t \geq 0, \]

U{T)=iP{n 
Fig. 4.6. Left: The global piecewise-linear approximation to $q(\lambda)$ obtained using Algorithm 1.
The cells in $\Lambda$ illustrate the coarse discretization of this space for the forward problem of obtaining
a piecewise-linear approximation and the circles in each cell indicate the reference parameter used
to linearize $q(\lambda)$ in that cell. We assume a normal distribution for $q(\lambda)$ and use a grid of 40 x 40
small cells.


where $A(t) := D_x f(y(t; \mu))$ is the Jacobian of $f = \lambda_1 \sin(\lambda_2 x)$ evaluated at $y(t; \mu)$, $\mu$ is a
reference parameter, and $y(t; \mu)$ is the solution to (4.1.3) for this reference parameter.
Compare this to (3.3). Using substitution, integration by parts, and Taylor's theorem,
we arrive at a linear approximation to $q(\lambda)$ for parameters near $\mu$, and analogous to
the finite dimensional case, we obtain a global piecewise-linear approximation to $q(\lambda)$
over $\Lambda = [.8, 1.2] \times [.1, \pi - .1]$ shown in Figure 4.6.
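
The adjoint-based linearization for the differential equation can be illustrated numerically. The sketch below is a minimal example under stated assumptions: $T = 2$, the terminal condition $\phi(T) = 0$ (the natural choice for a functional that is purely a time integral), a classical Runge-Kutta forward solve, and a trapezoidal backward solve of the adjoint equation. The step count, function names, and reference parameter are hypothetical. The gradient is computed as $\partial q / \partial \lambda_j = \int_0^T \phi \, \partial f / \partial \lambda_j \, dt$, which follows from integration by parts.

```python
import numpy as np

T, N = 2.0, 2000      # final time and number of uniform steps (illustrative values)
H = T / N
PSI = 0.5             # psi(t) = 1/2 on [0, 2], so q is the time average of x

def f(x, lam):
    return lam[0] * np.sin(lam[1] * x)

def forward(lam):
    """Classical RK4 for x' = lam_1 sin(lam_2 x), x(0) = 1."""
    x = np.empty(N + 1)
    x[0] = 1.0
    for n in range(N):
        k1 = f(x[n], lam)
        k2 = f(x[n] + 0.5 * H * k1, lam)
        k3 = f(x[n] + 0.5 * H * k2, lam)
        k4 = f(x[n] + H * k3, lam)
        x[n + 1] = x[n] + H * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return x

def trapezoid(y):
    """Composite trapezoid rule on the uniform grid, applied along the last axis."""
    return H * (y.sum(axis=-1) - 0.5 * (y[..., 0] + y[..., -1]))

def qoi(lam):
    return trapezoid(PSI * forward(lam))

def adjoint_gradient(mu):
    """Gradient of q at the reference parameter mu from one adjoint solve."""
    y = forward(mu)
    A = mu[0] * mu[1] * np.cos(mu[1] * y)      # df/dx along the reference solution
    phi = np.zeros(N + 1)                      # assumed terminal condition phi(T) = 0
    for n in range(N - 1, -1, -1):             # trapezoidal rule, marching backward
        phi[n] = (phi[n + 1] * (1 + 0.5 * H * A[n + 1]) + H * PSI) / (1 - 0.5 * H * A[n])
    dfdlam = np.array([np.sin(mu[1] * y), mu[0] * y * np.cos(mu[1] * y)])
    return trapezoid(phi * dfdlam)

mu = np.array([1.0, 1.5])
grad = adjoint_gradient(mu)
# Local linear model: q(lam) ~ qoi(mu) + grad @ (lam - mu) for lam near mu.
```

Both the reference solution and the adjoint solution carry discretization error here, which is the issue raised in Remark 4.1.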

Remark 4.1. There can be substantial error in the reference solutions and gradients
used when applying the method to differential equations whose solutions must
be approximated numerically, and we study the effect of these errors in the second
paper [4].

4.1.4.  A  two-parameter  elliptic  partial  differential  equation.  We  now 
study  a  nonlinear  elliptic  partial  differential  equation 

f-Au  =  Ai(u  -  A2)^,  (x,y)  €  n  =  (0, 1]  X  (0,1), 

|n  =  0,  (s^.y)  €  dQ. 

The quantities of interest, $q(\lambda)$, take the form

\[
q(\lambda) = \langle u, \psi \rangle = \int_\Omega u(x, y)\, \psi(x, y) \, dx \, dy,
\]

and we take the quantity of interest to be the average value of $u$ over $\Omega$. Thus, we set
$\psi(x, y) = 1$, and the generalized Green's function $\phi$ solves the adjoint problem,

\[
\begin{cases}
-\Delta \phi - A^\top \phi = \psi, & (x, y) \in \Omega, \\
\phi = 0, & (x, y) \in \partial\Omega,
\end{cases}
\]

where $A := D_u f(u(x, y; \mu); \mu)$ is the Jacobian of $f = \lambda_1 \exp(\lambda_2 u)$ evaluated at $u(x, y; \mu)$,
$\mu$ is a reference parameter, and $u(x, y; \mu)$ is the solution to (4.1.4) for this reference
parameter. Using substitution, the weak form of (4.1.4), and Taylor's theorem, we
arrive at a linear approximation to $q(\lambda)$ for parameters near $\mu$, and just as with the
previous examples, we obtain a global piecewise-linear approximation to $q(\lambda)$ over
$\Lambda = [.95, 1.05] \times [-.1, .1]$ using Algorithm 1. We show the results in Figure 4.7.
Fig. 4.7. Left: Global piecewise-linear approximation to $q(\lambda)$ obtained using Algorithm 1. We
used an 11 x 13 grid of coarse cells to discretize $\Lambda$ and used the midpoint of each cell as the reference
parameter in that cell. We assume a normal distribution of $q(\lambda)$ and we use a 33 x 39 grid of small
cells.


Fig. 4.8. Left: Generalized contours from 500 samples of $q(\lambda) = \lambda_1 + \lambda_2$ generated from a
$N(0, 2/25)$ distribution. Middle: The TP intersects each contour once and goes from the minimum
of $q(\lambda)$ in the lower-left corner to the maximum of $q(\lambda)$ in the upper-right corner of the plot. Right:
Intersections of contours on the TP are marked with a star and can be used to index the inverses
and determine a unique distribution of the contours on the TP using any consistent indexing scheme.


4.2. Determining regions of high probability. The new method can be
applied to find regions of high probability. Consider $q(\lambda) = \lambda_1 + \lambda_2$, where
$\Lambda = [0, 1] \times [0, 1]$. Figure 4.8 shows the generalized contours for 500 samples of $q(\lambda)$ taken
from a $N(0, 2/25)$ distribution along with the TP and the intersections of contours on
the TP. Where the contours intersect the TP most densely corresponds to a region of
high probability in the space of contours.

We can locate regions of high probability by sorting through the probability of
the fine cells $\{b_i\}$. We can rank order these cells and determine any cells of high
probability. We can also determine regions of neighboring cells that all have relatively
high probability. We illustrate using the four-parameter geometric constrained
optimization problem in section 4.1.2. In Table 1, we list the ten small cells with highest
probability. If we let the events $\{b_i\}$ become small, under a smoothness assumption,
the probabilities of these events are related to the maximum-likelihood estimate.
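
Ranking the fine cells by probability is a simple sort once the cell probabilities are stored on a tensor grid. The sketch below is illustrative: `top_cells` is a hypothetical helper, and the random array is a synthetic stand-in for the cell probabilities that Algorithm 1 would produce.

```python
import numpy as np

def top_cells(prob, edges, k=10):
    """Return the k most probable cells as (probability, [(lo, hi), ...]) pairs."""
    order = np.argsort(prob, axis=None)[::-1][:k]    # flat indices, largest first
    result = []
    for flat in order:
        idx = np.unravel_index(flat, prob.shape)
        box = [(float(e[i]), float(e[i + 1])) for e, i in zip(edges, idx)]
        result.append((float(prob[idx]), box))
    return result

# Synthetic cell probabilities on a 4 x 3 grid over the unit square.
rng = np.random.default_rng(0)
prob = rng.random((4, 3))
prob /= prob.sum()
edges = [np.linspace(0.0, 1.0, 5), np.linspace(0.0, 1.0, 4)]
ranked = top_cells(prob, edges, k=3)
```

Grouping the returned boxes by adjacency is one way to identify connected regions of neighboring cells with relatively high probability.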

5. Conclusion. We consider the probabilistic inverse sensitivity analysis problem:
Given a specified uncertainty in the output of a map, determine variations in
the parameters that produce the observed uncertainty. We formulate this inverse
problem using the law of total probability. We describe and analyze a method for
computing the approximate probability density that solves the inverse problem and
does not require random sampling. Our approach breaks the solution down into two
stages:
Table 1

We indicate the location of the ten cells with the highest probabilities for the example in
section 4.1.2. The first column gives the probability and the second column gives the dimensions and
location of the cells. There are clearly two distinct regions for events with relatively high probability.
In general, one can use this information to determine where the largest regions of highest probability
are located in a high-dimensional parameter space.

probability    b_i location
0.600381927    [2.44, 2.52] x [1.22, 1.26] x [2.04, 2.12] x [0.4, 0.4167]
0.600446977    [2.36, 2.44] x [1.06, 1.1] x [1.96, 2.04] x [0.4333, 0.45]
0.600462420    [2.44, 2.52] x [1.18, 1.22] x [2.04, 2.12] x [0.4333, 0.45]
0.600465732    [2.36, 2.44] x [0.98, 1.02] x [2.04, 2.12] x [0.4167, 0.4333]
0.600470136    [2.36, 2.44] x [1.06, 1.1] x [1.96, 2.04] x [0.4167, 0.4333]
0.600474821    [2.36, 2.44] x [1.26, 1.3] x [1.96, 2.04] x [0.4167, 0.4333]
0.600501752    [2.36, 2.44] x [0.98, 1.02] x [2.04, 2.12] x [0.4333, 0.45]
0.600463048    [1.4, 1.48] x [1.18, 1.22] x [1.64, 1.72] x [0.3833, 0.4]
0.600464252    [1.4, 1.48] x [1.18, 1.22] x [1.64, 1.72] x [0.35, 0.3667]
0.600468545    [1.4, 1.48] x [1.18, 1.22] x [1.64, 1.72] x [0.3667, 0.3833]

1. Construct an approximate representation of the set-valued inverse solution of
the ill-posed deterministic inverse problem.

2.  Approximate  the  density  on  the  parameter  space  that  corresponds  to  the 
set-valued  inverse  and  the  observed  output  density  using  a  simple  function 
representation. 

We  illustrate  the  method  and  several  features  using  a  variety  of  examples. 

In [4] we present numerical analysis of discretization error, e.g., in evaluating the
model by numerical solution and in finite sampling. In [5], we discuss the problem of
dealing with multiple quantities of interest, which has application to data assimilation
and “cascaded” uncertainty in operator decomposition solution of multiphysics
problems.
Abstract. We describe and test an adaptive algorithm for evolution problems that employs a
sequence of “blocks” consisting of fixed, though non-uniform, space meshes. This approach offers
the  advantages  of  adaptive  mesh  refinement  but  with  reduced  overhead  costs  associated  with  load 
balancing,  re-meshing,  matrix  reassembly,  and  the  solution  of  adjoint  problems  used  to  estimate 
discretization  error  and  the  effects  of  mesh  changes.  A  major  issue  with  a  block-adaptive  approach 
is  determining  block  discretizations  from  coarse  scale  solution  information  that  achieve  the  desired 
accuracy.  We  describe  several  strategies  to  achieve  this  goal  using  adjoint-based  a  posteriori  error 
estimates  and  we  demonstrate  the  behavior  of  the  proposed  algorithms  as  well  as  several  technical 
issues  in  a  set  of  examples. 

Key  words,  a  posteriori  error  analysis,  adaptive  error  control,  adaptive  mesh  refinement, 
adjoint  problem,  discontinuous  Galerkin  method,  duality,  generalized  Green’s  function,  goal  oriented 
error  estimates,  residual,  variational  analysis 

AMS  subject  classifications.  65N15,  65N30,  65N50 

1. Introduction. We describe and test an adaptive algorithm for evolution
problems that we call “blockwise adaptivity”. This approach employs a sequence
of “blocks” consisting of fixed, though non-uniform, space meshes, and is motivated
by considerations of efficiency and accuracy. We balance the goal of achieving desired
accuracy using discretizations with relatively few degrees of freedom against the
computational costs associated with load balancing, re-meshing, matrix reassembly
and in particular the cost of error estimation. A block adaptive strategy reduces the
number of mesh changes that must be treated, which reduces the amount of
computational time spent on re-meshing, assembly, and load balancing, and makes the
problem of quantifying the effects of mesh changes on accuracy computationally
feasible. A block adaptive strategy also provides a natural coarse scale discretization
on which to solve the adjoint problem used to compute global a posteriori error
estimates. This reduces the twin computational difficulties of storing a fine scale forward
solution in order to form the adjoint problem and solving the adjoint problem on that
fine scale discretization. However, a major issue with a block-adaptive approach is
determining block discretizations from coarse scale solution information that achieve
the desired accuracy and efficiency. We describe several strategies to achieve this goal
using adjoint-based a posteriori error estimates.

To focus the discussion, we consider a reaction-diffusion equation for the solution
$u$ on an interval $[0, T]$,

\[
\begin{cases}
\dot u - \nabla \cdot (\epsilon(x, t) \nabla u) = f(u, x, t), & (x, t) \in \Omega \times (0, T], \\
u(x, t) = 0, & (x, t) \in \partial\Omega \times (0, T], \\
u(x, 0) = u_0(x), & x \in \Omega,
\end{cases}
\tag{1.1}
\]

where $\Omega$ is a convex polygonal domain in $\mathbb{R}^d$ with boundary $\partial\Omega$, $\dot u$ denotes the partial
derivative of $u$ with respect to time, and there is a constant $\epsilon_0 > 0$ such that

\[ \epsilon(x, t) \geq \epsilon_0, \qquad x \in \Omega, \ t > 0. \]

We also assume that $\epsilon$ and $f$ have smooth second derivatives. The algorithms in this
paper generalize to problems with different boundary conditions, convection, nonlinear
diffusion coefficients, as well as systems, see [17, 15].

In terms of adaptive mesh refinement, the interesting situation is a solution of
(1.1) that exhibits “regionalized” behavior in space and time. Considerations of
efficiency suggest that time steps and space meshes should be locally refined to match
the regional behavior, see the plot on the left in Fig. 1.1. Classic adaptive mesh
refinement can be described as a constrained optimization problem, e.g., determine a
discretization using the fewest degrees of freedom that yields a solution satisfying a
given error criterion. In general, it is impossible to determine a closed-form solution
of this optimization problem. An adaptive algorithm is an iterative procedure for
determining a nearly optimal solution.


Fig.  1.1.  The  evolution  of  a  traveling  front  solution.  Left:  A  computation  using  space  meshes 
chosen  by  a  standard  adaptive  strategy  to  control  the  spatial  residual  error  at  each  time  step.  This 
entails re-meshing, re-assembly, load balancing, and projecting the solution on a new mesh at each
step.  Right:  The  uniform  mesh  that  is  required  to  achieve  the  same  control  over  the  residual.  The 
computation  is  assembled  and  load  balanced  only  once. 

We present a generic adaptive algorithm in Algorithm 1.1. An adaptive computation
is generally started with an initial coarse mesh. The adaptive algorithm is then
applied  “real-time”  as  the  integration  proceeds  so  as  to  generate  a  new  space  mesh 
for  each  new  time  step,  where  the  new  space  mesh  is  based  on  (or  adapted  from)  the 
mesh  for  the  current  time  step.  In  practice,  the  remeshing  may  be  applied  on  intervals 
of  a  small  number  of  steps. 

While adaptive mesh refinement is appealing on an intuitive level, there are
serious issues facing its use for evolution problems including the following.
Algorithm 1.1 Generic Adaptive Algorithm for an Evolution Problem

Choose an initial coarse mesh and time step
while the final time has not been reached do
    Compute a numerical solution using the current time step and space mesh
    Estimate the error of the computed solution
    while the error estimate is too large do
        Estimate local error contributions and adapt in space
        Estimate local error contributions and adapt in time
        Compute a numerical solution using the new time step and space mesh
        Estimate the error of the computed solution
    end while
    Increment time by the accepted time step
end while
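
The control flow of Algorithm 1.1 can be illustrated with a deliberately small example. The sketch below is a time-only analogue under stated assumptions: there is no space mesh, the model problem is the scalar equation $u' = -u$ solved by backward Euler, and the error estimate is a step-doubling comparison rather than an adjoint-based estimate. All names (`adaptive_march`, `be_step`, `step_doubling`) and the shrink and growth factors are hypothetical.

```python
import math

def adaptive_march(u0, t_end, k0, step, estimate, tol, shrink=0.5, grow=1.5):
    """March to t_end, shrinking the step until the local estimate meets tol."""
    t, u, k = 0.0, u0, k0
    history = [(t, u)]
    while t < t_end - 1e-14:                 # final time not yet reached
        k = min(k, t_end - t)
        u_new = step(u, t, k)                # compute with the current step
        err = estimate(u, t, k, u_new)       # estimate the error
        while err > tol:                     # adapt (in time only) and recompute
            k *= shrink
            u_new = step(u, t, k)
            err = estimate(u, t, k, u_new)
        t, u = t + k, u_new                  # accept the step
        history.append((t, u))
        k *= grow                            # try a larger step next time
    return history

def be_step(u, t, k):
    """Backward Euler for u' = -u."""
    return u / (1.0 + k)

def step_doubling(u, t, k, u_new):
    """Compare one step of size k against two steps of size k/2."""
    half = be_step(be_step(u, t, 0.5 * k), t + 0.5 * k, 0.5 * k)
    return abs(half - u_new)

history = adaptive_march(1.0, 1.0, 0.5, be_step, step_doubling, tol=1e-4)
t_final, u_final = history[-1]
```

Every rejected step costs an extra solve and an extra error estimate, which in the full space-time setting also means re-meshing and reassembly. That overhead is what motivates keeping the mesh fixed over longer blocks of time.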


1. Accuracy. Each spatial mesh change requires a projection of the numerical
solution onto the new mesh, and this can affect accuracy. In fact, this can
destroy convergence altogether, see [8].

2. Overhead Costs. Changing the spatial discretization requires generating a
new mesh and reassembly of matrices. Significant mesh changes require a
redistribution of unknowns among the processors to achieve load balancing.
All of these tasks are computationally intensive.

3. Coarsening. Un-refinement or coarsening of a mesh involves loss of information
about a numerical solution that cannot be recovered. Currently, there is
no theory for coarsening that guarantees that there is no loss of accuracy.

4. Global Error Estimation. Efficient adaptive mesh refinement requires accurate
error estimates of the true, global error, but cancellation of errors over
both space and time makes choosing adapted meshes problematic.

Using a fixed spatial mesh eliminates the first three issues. But the scale required of
the mesh is determined by the finest scale required in any region where discretization
impacts global accuracy, see Fig. 1.1. This necessarily increases computational time
and solver costs, and memory limits may make it impossible to use the necessary
uniform mesh.

In this paper, we propose a “blockwise” adaptive algorithm that employs nonuniform
meshes that remain fixed for discrete periods of time, or “blocks”, see Fig. 1.2.
With the proper implementation, this strategy addresses the following key issues.

1. Accuracy. The projections onto new meshes occur at a relatively small set
of discrete times. We use a posteriori error estimates to predict the effect of
the projections and choose overlaps in the meshes to reduce the error induced
by the mesh changes.

2. Overhead Costs. Re-meshing, assembly, and load balancing are required
only at the discrete times demarcating blocks.

3. Coarsening. There is no coarsening of a given mesh in the indicated strategy.
Mesh changes are handled purely as projections between different meshes.

The  idea  of  re-meshing  only  after  a  fixed  number  of  steps  is  by  no  means  new. 
However,  this  strategy  depends  critically  upon  choosing  suitable  block  discretizations, 
and  thus,  ultimately,  on  accurately  predicting  the  behavior  of  the  solution.  The  choice 
of  block  discretizations  is  a  difficult  issue  that  requires  balancing  the  inefficiency  of 
using  a  fixed  spatial  mesh  inside  each  block  against  the  gain  in  accuracy  achieved 
Fig. 1.2. The evolution of a solution with a traveling front computed using blockwise adaptivity
with two blocks. On each block, the space mesh is chosen to maintain the same level of control
over the local residual as is achieved in the computation shown in Fig. 1.1. In addition, there is a
sufficient degree of overlap between the two meshes (the lightly-shaded mesh region) to ensure there
is no loss of accuracy in projecting the solution between the two meshes. Re-meshing, assembly, and
load balancing are only required twice, once for each block.


by limiting projections between different meshes and the decrease in computational
cost due to limiting the number of times at which re-meshing, re-assembly, and load
balancing are required. This is partly a computer science problem of distributing
available resources, e.g., memory and compute cycles, efficiently, and partly a numerical
analysis problem, e.g., determining meshes for each block and projections between
blocks.

In this paper, we focus on the problem of determining blocks, e.g., the length
of times for each block, the meshes for each block that maintain accuracy in the
desired information, and suitable overlap meshes for transitions between blocks from
the coarse-scale adjoint solutions. The solutions of these problems require accurate
estimates of the error in a specific quantity of interest. We use a computable a
posteriori error estimate that yields robustly accurate estimates of the error in a
specified quantity of interest in terms of a sum of space-time element contributions,
see [9, 10, 17, 15, 3, 20]. The a posteriori error estimates are based on duality,
adjoint problems, and variational analysis. Accurate error estimates are obtained
by numerically solving the linear adjoint problem related to the desired quantity of
interest.

Solving  adjoint  problems  offers  computational  challenges  such  as  the  need  to  store 
the  forward  solution  in  order  to  form  the  adjoint  problem  and  the  cost  of  the  adjoint 
solve.  Our  approach  is  to  perform  the  adjoint  solves  using  relatively  coarse  scale 
discretizations  and  using  a  coarse  scale  representation  of  the  forward  solution  to  form 
the  adjoint  problem,  which  reduces  the  memory  overhead  and  the  cost  of  the  adjoint 
solve.  This  approach  is  motivated  by  the  following  observations. 

1. Adjoint problems are linear and often present fewer numerical difficulties than
the associated forward problems.

2. Solutions of adjoint problems tend to vary slowly on the scale of the
discretization, whereas residuals of forward solutions tend to oscillate on the
scale of the discretization.

3. The accuracy required of the adjoint solution, which is being used only for
error estimation, is orders of magnitude less than generally desired for the
forward solution.

An enormous literature on adaptive methods for differential equations has developed
over nearly six decades of activity and the major developments form a highly
inter-connected web. We do not attempt to review the history of adaptive methods or
to present a comprehensive list of references. Instead, we provide only a short list of
references that either contain further references and/or address computational issues
related to adaptive mesh refinement for evolution problems [8, 7, 5, 4, 18, 22, 9, 10,
17, 19, 15, 3, 1, 23, 24, 20, 2, 14].

This paper considers adaptive mesh refinement from a different point of view than
much of the existing literature. Namely, we are concerned with trying to understand
how to adapt discretizations based on under-resolved solutions on relatively coarse
discretizations in order to obtain particular information, as opposed to analyzing
adaptive mesh algorithms in the asymptotic limit of mesh refinement. This point of
view is important for many large scale applications, for which such conditions are
generic. In §2 we review the standard a posteriori error analysis and modify this for
a block adaptive strategy. We review adaptive error control in §3 and introduce new
features necessary for block adaptivity and several block adaptive strategies. One-
and three-dimensional illustrative computational examples are provided in §4 and we
draw conclusions in §5.

2. Discretization and error estimation. We begin by reviewing discretization
and a posteriori error estimation for evolution problems and then describe the
block-wise discretization and present the corresponding error estimate.

2.1. Discretization. We formulate the discretization as a space-time finite element
method because that is convenient for deriving a posteriori error estimates based
on variational analysis. However, we emphasize that the estimates can be extended
to a wide range of discretizations, e.g. finite difference and finite volume methods,
which can be written as equivalent finite element methods.

We describe two finite element space-time discretizations of (1.1) called the continuous
and discontinuous Galerkin methods, see [11, 13, 12, 10, 17, 15]. We partition
$[0, T]$ as $0 = t_0 < t_1 < t_2 < \cdots < t_n < \cdots < t_N = T$, denoting each time interval by
$I_n = (t_{n-1}, t_n]$ and time step by $k_n = t_n - t_{n-1}$, and we construct a discretization $\mathcal{T}$
of $\Omega$ such that the union of the elements in $\mathcal{T}$ is $\Omega$ while the intersection of any two
elements is either a common edge, node, or is empty. We assume that the smallest
angle of any element is bounded below by a fixed constant. To measure the size of the
elements of $\mathcal{T}$, we use a piecewise constant function $h$, the so-called mesh function,
defined so $h|_K = \mathrm{diam}(K)$ for $K \in \mathcal{T}$. Similarly, we use $k$ to denote the piecewise
constant function that is $k_n$ on $I_n$.

The approximations are polynomials in time and piecewise polynomials in space
on each space-time “slab” $S_n = \Omega \times I_n$. In space, we let $V \subset H_0^1(\Omega)$ denote the space
of piecewise linear continuous functions defined on $\mathcal{T}$, where each function is zero on
$\partial\Omega$. Then on each slab, we define

\[ W_n^q = \Big\{ w(x,t) : w(x,t) = \sum_{j=0}^{q} t^j v_j(x),\ v_j \in V,\ (x,t) \in S_n \Big\}. \]

Finally, we let $W^q$ denote the space of functions defined on the space-time domain
$\Omega \times [0,T]$ such that $v|_{S_n} \in W_n^q$ for $n \ge 1$. Note that functions in $W^q$ may be
discontinuous across the discrete time levels, and we denote the jump across $t_n$ by
$[w]_n = w_n^+ - w_n^-$, where $w_n^\pm = \lim_{s \to t_n^\pm} w(s)$.

We use a projection operator into $V$, $Pv \in V$, e.g. the $L^2$ projection satisfying
$(Pv, w) = (v, w)$ for all $w \in V$, where $(\cdot,\cdot)$ denotes the $L^2(\Omega)$ inner product. We
use $\|\cdot\|$ for the $L^2$ norm. We also use a projection operator into the piecewise
polynomial functions in time, denoted by $\pi_n : L^2(I_n) \to \mathcal{P}^q(I_n)$, where $\mathcal{P}^q(I_n)$ is the
space of polynomials of degree $q$ or less defined on $I_n$. The global projection operator
$\pi$ is defined by setting $\pi = \pi_n$ on $S_n$.

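Definition 2.1. The discontinuous Galerkin dG(q) approximation $U \in W^q$
satisfies $U_0^- = P u_0$ and

\[ \int_{t_{n-1}}^{t_n} \big( (\dot U, v) + (\epsilon \nabla U, \nabla v) \big)\, dt + ([U]_{n-1}, v^+_{n-1}) = \int_{t_{n-1}}^{t_n} (f(U), v)\, dt \quad \text{for all } v \in W_n^q,\ 1 \le n \le N. \tag{2.1} \]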


We  also  use  a  related  method  for  solving  the  adjoint  problem: 

Definition 2.2. The continuous Galerkin cG(q) approximation $U \in W^q$
satisfies $U_0^- = P u_0$ and

\[ \int_{t_{n-1}}^{t_n} \big( (\dot U, v) + (\epsilon \nabla U, \nabla v) \big)\, dt = \int_{t_{n-1}}^{t_n} (f(U), v)\, dt \quad \text{for all } v \in W_n^{q-1},\ 1 \le n \le N. \tag{2.2} \]

Note that $U$ is continuous across time nodes when the space mesh is fixed.

With appropriate use of quadrature to evaluate the integrals in the variational
formulation, these Galerkin methods yield standard difference schemes. If the lumped
mass quadrature is used in space, then the discrete system yielding the dG(0) approximation
is the same as the system obtained for the nodal values of the “backward Euler
in time”-“second order centered difference scheme in space” finite difference scheme.
Likewise, the cG(1) method is related to the Crank-Nicolson scheme, and the dG(1)
method is related to the third order sub-diagonal Padé difference scheme. Under general
assumptions, the cG(q) and dG(q) methods have order of accuracy $q + 1$ in time at any
point and a superconvergence order of $2q$ and $2q + 1$, respectively, at time nodes.
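
As a concrete illustration of the dG(0) correspondence noted above, the following Python sketch advances the nodal values of a one-dimensional model problem $u_t - \epsilon u_{xx} = f$ with homogeneous Dirichlet conditions by one backward Euler step with second order centered differences. The constant coefficient, the uniform mesh, the treatment of $f$ as given nodal data, and all function names are assumptions made for illustration; this is not the code used for the computations reported here.

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm (no pivoting).

    lower[i] and upper[i] are the sub- and super-diagonal entries of row i;
    lower[0] and upper[-1] do not influence the result.
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def backward_euler_step(u, k, h, eps, f_new):
    """One backward Euler step for interior nodal values u (zero boundary data).

    Row i of the linear system reads
        -r*U[i-1] + (1 + 2*r)*U[i] - r*U[i+1] = u[i] + k*f_new[i],
    with r = k*eps/h**2 (hypothetical constant-coefficient setting).
    """
    n = len(u)
    r = k * eps / h**2
    lower = [-r] * n
    diag = [1.0 + 2.0 * r] * n
    upper = [-r] * n
    rhs = [u[i] + k * f_new[i] for i in range(n)]
    return solve_tridiagonal(lower, diag, upper, rhs)


if __name__ == "__main__":
    h, k, eps = 0.1, 0.05, 1.0
    nodes = [h * (i + 1) for i in range(9)]
    u0 = [math.sin(math.pi * x) for x in nodes]
    u1 = backward_euler_step(u0, k, h, eps, [0.0] * len(u0))
    print(max(u1))
```

For $f = 0$ and initial data $\sin(\pi x)$, each nodal value is damped by the factor $1/(1 + k\epsilon\lambda_h)$ with $\lambda_h = (2 - 2\cos(\pi h))/h^2$, which gives a simple check of the implementation.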

2.2. An a posteriori error estimate. We begin by defining a suitable adjoint
problem for error analysis. A more detailed description is given in [15]. The adjoint
problem is a parabolic problem with coefficients obtained by linearization around an
average of the true and approximate solutions,

\[ \bar f = \bar f(u, U) = \int_0^1 f'(us + U(1 - s))\, ds. \tag{2.3} \]

The regularity of $u$ and $U$ typically implies that $\bar f$ is piecewise continuous with respect
to $t$ and a continuous, $H^1$ function in space.

Written out pointwise for convenience, the adjoint problem to (1.1) for the generalized
Green’s function associated to the data $\psi$, which determines the quantity of
interest, is

\[ \begin{aligned} -\dot\phi - \nabla \cdot (\epsilon \nabla \phi) - \bar f \phi &= \psi, && (x,t) \in \Omega \times (T, 0], \\ \phi(x,t) &= 0, && (x,t) \in \partial\Omega \times (T, 0], \\ \phi(x,T) &= 0, && x \in \Omega. \end{aligned} \tag{2.4} \]

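This choice for the adjoint yields the following error representation formula for
the dG method.

Theorem 2.3. dG A Posteriori Error Estimate

\[ \begin{aligned} \int_0^T (e, \psi)\, dt ={}& ((I - P)u_0, \phi(0)) + \sum_{n=1}^{N} ([U]_{n-1}, (\pi P \phi - \phi)^+_{n-1}) \\ &+ \int_0^T \big( (\dot U, \pi P \phi - \phi) + (\epsilon(U) \nabla U, \nabla(\pi P \phi - \phi)) - (f(U), \pi P \phi - \phi) \big)\, dt. \end{aligned} \tag{2.5} \]

The initial error is $e^-(0) = (I - P)u_0$.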

In practice, we compute a numerical solution of the linear adjoint problem obtained
from (2.4) by replacing $u$ with the computed approximate solution $U$ in the
definition of $\bar f$, and we solve using a higher order method in space and time, see [15]. We
denote the approximate adjoint solution by $\Phi$. We focus on the dG method;
application to the cG method is analogous.

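Corollary 2.4. The approximate a posteriori error estimate for the dG method
is

\[ \begin{aligned} E(U) = E(U; \psi) = \bigg| & ((I - P)u_0, \Phi(0)) + \sum_{n=1}^{N} ([U]_{n-1}, (\pi P \Phi - \Phi)^+_{n-1}) \\ &+ \int_0^T \big( (\dot U, \pi P \Phi - \Phi) + (\epsilon(U) \nabla U, \nabla(\pi P \Phi - \Phi)) - (f(U), \pi P \Phi - \Phi) \big)\, dt \bigg|. \end{aligned} \tag{2.6} \]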


2.3. Blockwise discretization. We describe the blockwise formulation of the
discontinuous Galerkin method. We partition $[0, T]$ into time blocks $0 = T_0 < T_1 <
T_2 < \cdots < T_b < \cdots < T_B = T$. We discretize each block $(T_{b-1}, T_b]$ by $T_{b-1} = t_{b,0} <
t_{b,1} < \cdots < t_{b,N_b} = T_b$, denoting each subinterval by $I_{b,n} := (t_{b,n-1}, t_{b,n}]$ and time step
by $k_{b,n} = t_{b,n} - t_{b,n-1}$. To each block $[T_{b-1}, T_b]$, we associate a discretization $\mathcal{T}_b$ of
$\Omega$ arranged so the union of the elements in $\mathcal{T}_b$ is $\Omega$ while the intersection of any two
elements is either a common edge, a common node, or empty. We assume that the smallest
angle of any element is bounded below by a fixed constant. To measure the size of
the elements of $\mathcal{T}_b$, we use the mesh function $h_b$.

The approximations are polynomials in time and piecewise polynomials in space
on each space-time “slab” $S_{b,n} = \Omega \times I_{b,n}$. In space, we let $V_b \subset H_0^1(\Omega)$ denote the
space of piecewise linear continuous functions defined on $\mathcal{T}_b$, where each function is
zero on $\partial\Omega$. Then on each slab, we define

\[ W_{b,n}^q = \Big\{ w(x,t) : w(x,t) = \sum_{j=0}^{q} t^j v_j(x),\ v_j \in V_b,\ (x,t) \in S_{b,n} \Big\}. \]

Finally, we let $W^q$ denote the space of functions defined on the space-time domain
$\Omega \times [0, T]$ such that $v|_{S_{b,n}} \in W_{b,n}^q$ for $b, n \ge 1$. Note that functions in $W^q$ may be
discontinuous across the discrete time levels, and we denote the jump across $t_{b,n}$ by

\[ [w]_{b,n} = w_{b,n}^+ - w_{b,n}^-. \]

To compute the dG approximation on the new block, we project the final value of
the approximation from the previous block onto the new mesh. We use a projection
operator $P_b v \in V_b$ and a projection operator into the piecewise polynomial functions
in time, denoted by $\pi_{b,n} : L^2(I_{b,n}) \to \mathcal{P}^q(I_{b,n})$. We then define $\pi_b$ as $\pi_b = \pi_{b,n}$ on
$S_{b,n}$. Finally, we define global projections $P$ and $\pi$ which on each block are $P_b$ and
$\pi_b$ respectively.

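Definition 2.5. The blockwise discontinuous Galerkin dG(q) approximation $U \in
W^q$ satisfies $U_0^- = P_1 u_0$ and, for $b = 1, 2, \cdots, B$,

\[ \int_{t_{b,n-1}}^{t_{b,n}} \big( (\dot U, v) + (\epsilon \nabla U, \nabla v) \big)\, dt + ([U]_{b,n-1}, v^+_{b,n-1}) = \int_{t_{b,n-1}}^{t_{b,n}} (f(U), v)\, dt \quad \text{for all } v \in W_{b,n}^q,\ 1 \le n \le N_b. \tag{2.7} \]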


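2.4. A blockwise a posteriori error estimate. Adapting the standard argument
that yields (2.5), we obtain a blockwise a posteriori error estimate.

Theorem 2.6. Blockwise A Posteriori Error Estimate

\[ \begin{aligned} \int_0^T (e, \psi)\, dt \approx{}& ((I - P_0)u_0, \Phi(0)) + \sum_{b=1}^{B} ((I - P_b)U(T_{b-1}), \Phi(T_{b-1})) \\ &+ \sum_{b=1}^{B} \bigg\{ \int_{T_{b-1}}^{T_b} \big( (\dot U, \pi P_b \Phi - \Phi) + (\epsilon(U) \nabla U, \nabla(\pi P_b \Phi - \Phi)) - (f(U), \pi P_b \Phi - \Phi) \big)\, dt \\ &\qquad\quad + \sum_{n=1}^{N_b} ([U]_{b,n-1}, (\pi P_b \Phi - \Phi)^+_{b,n-1}) \bigg\}. \end{aligned} \tag{2.8} \]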

The second term on the right measures the effects of changing meshes on the accuracy
of the approximation. A similar “jump” term already appears in the estimate for the
standard dG method at each time step. In the case of transitions between blocks, the
“jump” arises because of mesh changes between blocks. Note that the adjoint weight
does not involve the projection of $\Phi$ into the approximation space (i.e. Galerkin
orthogonality). Instead, the contributions from the projections accumulate in the
same way as an initial error.

Our purpose is to use the a posteriori bounds $\mathbb{E}_x$, $\mathbb{E}_t$ to choose block times $\{T_b\}$
and corresponding meshes $\mathcal{T}_b$ and time steps $k_{b,n}$. An important issue is the effect of
transferring solutions between the meshes of adjacent blocks on the accuracy of the
computed information, and so we address the computation of a bound on the second
term on the right in (2.8),

\[ \mathbb{E}_P(U) = \sum_{b=1}^{B} \big| ((I - P_b)U(T_{b-1}), \Phi(T_{b-1})) \big|. \tag{2.9} \]

3. Adaptive error control. We start by describing some standard approaches
to adaptive error control and their relation to adaptive error control based on a
posteriori error estimates. We then turn to the problem of choosing blocks for a block
discretization and generating the corresponding spatial and temporal discretizations
for each block.

3.1. Goal oriented adaptive error control. The aim of goal oriented adaptive
error control is to generate a mesh with a nearly minimal number of elements
such that for a given tolerance TOL and data $\psi$,

\[ \Big| \int_0^T (e, \psi)\, dt \Big| \le \mathrm{TOL}. \tag{3.1} \]

We note that (3.1) cannot be verified in practice because the error is unknown, so
instead we use an estimate or a bound for the error in the quantity of interest. Different
ways to generate an acceptable mesh vary by the estimate or bound used for the
quantity of interest as well as the strategy for mesh refinement.

For example, using the a posteriori estimate (2.6), the goal of adaptive error control
is to determine a discretization so that a mesh acceptance criterion,

\[ E(U) \le \mathrm{TOL}, \tag{3.2} \]

is satisfied. If (3.2) is not satisfied, then we refine the mesh in order to compute a new
solution for which the criterion is met. Refinement decisions require identifying the
contributions to the error from discretization on each element. We can write $E(U)$ as
a sum over space-time elements,

\[ \begin{aligned} E(U) = \bigg| & \sum_{\Delta \in \mathcal{T}} ((I - P)u_0, \Phi(0))_\Delta + \sum_{n=1}^{N} \sum_{\Delta \in \mathcal{T}} ([U]_{n-1}, (\pi P \Phi - \Phi)^+_{n-1})_\Delta \\ &+ \sum_{n=1}^{N} \sum_{\Delta \in \mathcal{T}} \int_{t_{n-1}}^{t_n} \big( (\dot U, \pi P \Phi - \Phi)_\Delta + (\epsilon(U) \nabla U, \nabla(\pi P \Phi - \Phi))_\Delta - (f(U), \pi P \Phi - \Phi)_\Delta \big)\, dt \bigg|, \end{aligned} \]

where $(\cdot,\cdot)_\Delta$ denotes the $L^2$ inner product on element $\Delta$. This clearly identifies
possible element contributions.

However, a major difficulty is that the error estimate generally involves a large
amount of cancelation among the element contributions, which makes determining a
truly efficient refinement strategy extremely difficult.


Fig.  3.1.  The  element  contributions  to  the  error  in  integration. 


Example 3.1. We consider a first order accurate numerical solution that has the
element contributions shown in Fig. 3.1.

The first time step has the largest contribution. The next three steps each contribute
$-0.033$, so cancelation means that the total contribution from the first four
steps is 0.001. Likewise, the next six steps contribute $+0.003$ in total. The last four
steps contribute 0.08 in total. The total error is therefore

\[ .1 - 3 \times .033 + .011 - .01 + .011 - .01 + .011 - .01 + 4 \times .02 = 0.084. \]

If we use a standard approach of refining only some fraction of the elements with the
largest contributions, we are likely to refine the first four steps. For simplicity, we
assume that the elements marked for refinement are divided into two time steps. The
resulting integration will have accuracy

\[ \tfrac{1}{4} \times 2 \times .1 - \tfrac{1}{4} \times 6 \times .033 + .011 - .01 + .011 - .01 + .011 - .01 + 4 \times .02 = 0.0835. \]

Note that the individual element contributions decrease at a second order rate. The
problem is that even though the element contributions in the first four steps are
individually large, there is significant cancelation in this region, and
refinement does not decrease the error significantly. On the other hand, if we refine
the last four time steps instead, we obtain

\[ .1 - 3 \times .033 + .011 - .01 + .011 - .01 + .011 - .01 + \tfrac{1}{4} \times 8 \times .02 = 0.044. \]

While this is a non-standard approach, it decreases the error significantly.
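
The arithmetic of Example 3.1 can be reproduced with a few lines of Python. The sketch below uses the contributions quoted in the example and assumes, as in the text, that each refined step is split in two and that each half contributes one quarter of the parent contribution; the helper names `refine` and `largest` are hypothetical.

```python
# Signed contributions of the 14 time steps, as listed in Example 3.1.
contributions = [0.1] + [-0.033] * 3 + [0.011, -0.01] * 3 + [0.02] * 4


def refine(contribs, marked, rate=2):
    """Split each marked step in two; each half contributes 2**-rate of its parent."""
    refined = []
    for i, c in enumerate(contribs):
        if i in marked:
            refined.extend([c / 2**rate, c / 2**rate])
        else:
            refined.append(c)
    return refined


def largest(contribs, count):
    """Indices of the `count` contributions that are largest in magnitude."""
    order = sorted(range(len(contribs)), key=lambda i: abs(contribs[i]), reverse=True)
    return set(order[:count])


total = sum(contributions)
# Standard marking: refine the four steps with the largest contributions.
after_largest = sum(refine(contributions, largest(contributions, 4)))
# Non-standard marking: refine the last four steps instead.
after_last = sum(refine(contributions, {10, 11, 12, 13}))
print(f"{total:.4f} {after_largest:.4f} {after_last:.4f}")
```

Marking by magnitude selects the first four steps, whose contributions nearly cancel, so the total barely changes, whereas refining the last four steps roughly halves it.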

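In the adjoint-weight approach, the issue of cancelation of error is neglected, in a
sense, by replacing the accurate error estimate $E(U)$ by an inaccurate upper bound,

\[ E(U) \le \mathbb{E}(U) = \mathbb{E}(U; \psi), \tag{3.3} \]

where we define $\mathbb{E}(U; \psi)$ by summing bounds over each element.

Definition 3.2. Element-wise upper bound on the total error

\[ \begin{aligned} \mathbb{E}(U; \psi) ={}& \sum_{\Delta \in \mathcal{T}} \big| ((I - P)u_0, \Phi(0))_\Delta \big| + \sum_{n=1}^{N} \sum_{\Delta \in \mathcal{T}} \big| ([U]_{n-1}, (\pi P \Phi - \Phi)^+_{n-1})_\Delta \big| \\ &+ \sum_{n=1}^{N} \sum_{\Delta \in \mathcal{T}} \bigg| \int_{t_{n-1}}^{t_n} (\dot U, \pi P \Phi - \Phi)_\Delta + (\epsilon(U) \nabla U, \nabla(\pi P \Phi - \Phi))_\Delta - (f(U), \pi P \Phi - \Phi)_\Delta \, dt \bigg|. \end{aligned} \]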


Thus, if (3.2) is not satisfied, the mesh is refined in order to achieve

\[ \mathbb{E}(U) \le \mathrm{TOL}. \tag{3.4} \]

The adaptive error control problem can now be profitably posed as a constrained
minimization problem, namely to find a mesh with a minimal number of degrees of
freedom on which the approximation satisfies the bound (3.4). Using the fact that
the bound $\mathbb{E}$ is a sum of positive terms and assuming the solution is asymptotically
accurate, a calculus of variations argument yields the following generic principle (see e.g. [9, 10, 3, 2]).

Principle of Equidistribution An approximate solution of the
constrained optimization problem for an optimal mesh for an upper
bound on the error is achieved when the element contributions to
the bound are approximately equal.


The Principle of Equidistribution has been used in various forms at least since the
seventies (and probably earlier in industry). However, experience with a wide range
of problems suggests that the bound $\mathbb{E}(U)$ is generically several orders of magnitude
larger than the estimate $E(U)$. A strategy based on the Principle of Equidistribution
that optimizes computational cost with respect to an error bound and not the actual
error can therefore result in significant over-refinement.

In general, there are many solutions of the constrained minimization problem
associated with (3.4). An adaptive mesh algorithm is a procedure for computing an
acceptable solution. Traditionally, different approaches are used for spatial and temporal
adaption. A global “compute-estimate-mark-adapt” algorithm (see for example
Algorithm 1.1) is typically used for spatial meshes. This is an iterative approach in which only
some fraction of the elements on which the contribution to the error bound is largest
are refined during each iteration, and the whole cycle is iterated until a prescribed tolerance
is achieved. Temporal approaches to mesh adaption, e.g., local error control
[21], tend to use a “sweeping” strategy from initial to final time, where a solution
is advanced past each time step only when the step contribution is estimated to be
lower than an acceptable fraction of the total error. This may be viewed as a generally
pessimistic way to achieve the Principle of Equidistribution because it removes
positive effects of cancelation of error altogether. As a consequence of these differences,
element contributions to the error estimate or bound typically vary in size quite
considerably while contributions from different time intervals are more nearly equal.

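We use a strategy that treats space and time discretizations more equitably. In
the case of a parabolic problem, it is straightforward to distinguish the time and space
contributions to the bound $\mathbb{E}$. We define the time and space bounds as follows.

Definition 3.3. Element-wise temporal and spatial error bounds

\[ \begin{aligned} \mathbb{E}_t(U) ={}& \sum_{n=1}^{N} \sum_{\Delta \in \mathcal{T}} \big| ([U]_{n-1}, ((\pi - I) P \Phi)^+_{n-1})_\Delta \big| \\ &+ \sum_{n=1}^{N} \sum_{\Delta \in \mathcal{T}} \bigg| \int_{t_{n-1}}^{t_n} (\dot U, (\pi - I) P \Phi)_\Delta + (\epsilon(U) \nabla U, \nabla (\pi - I) P \Phi)_\Delta - (f(U), (\pi - I) P \Phi)_\Delta \, dt \bigg|, \end{aligned} \tag{3.5} \]

\[ \begin{aligned} \mathbb{E}_x(U) ={}& \sum_{\Delta \in \mathcal{T}} \big| ((I - P)u_0, \Phi(0))_\Delta \big| + \sum_{n=1}^{N} \sum_{\Delta \in \mathcal{T}} \big| ([U]_{n-1}, (P \Phi - \Phi)^+_{n-1})_\Delta \big| \\ &+ \sum_{n=1}^{N} \sum_{\Delta \in \mathcal{T}} \bigg| \int_{t_{n-1}}^{t_n} (\dot U, P \Phi - \Phi)_\Delta + (\epsilon(U) \nabla U, \nabla(P \Phi - \Phi))_\Delta - (f(U), P \Phi - \Phi)_\Delta \, dt \bigg|. \end{aligned} \tag{3.6} \]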


We see that the time bound is precisely the a posteriori bound for the dG approximation
for the “method of lines” initial value problem resulting after discretization in
space. The adjoint weight depends on the projection of the adjoint solution into the
time finite element space. On the other hand, the adjoint weight in the space bound
depends on the projection of the adjoint solution into the spatial finite element space.
We split the error between the time and space contributions and refine the current
mesh in order to achieve

\[ \mathbb{E}_x(U) \le \frac{\mathrm{TOL}}{2} \quad \text{and} \quad \mathbb{E}_t(U) \le \frac{\mathrm{TOL}}{2}. \tag{3.7} \]

On a given time interval, this requires an iteration during which both the space mesh
and time steps are refined.

3.2. Goal oriented block adaptive error control. For the purpose of developing
a block adaptive algorithm, we treat adaptivity with respect to space and
time in the same way. The reason is that we determine the blocks by predicting the
local element sizes (or number of sub-elements) that are required in the final mesh.
We create a block by grouping together a set of coarse-scale space-time slabs that are
adjacent in time and satisfy some criteria, e.g. similar spatial meshes are predicted
for the space-time slabs in the block or a maximal number of elements are predicted
to be required in the block.

3.2.1. Choosing a global tolerance for the error bound. We want the predictions
of the element sizes required in an acceptable fine scale mesh to be as accurate
as possible. We recall that an acceptable mesh need only satisfy the estimate criterion
(3.2) and not the more stringent bound criterion (3.4). We define the overestimation
factor for a given mesh,

\[ \gamma = \frac{\mathbb{E}(U)}{E(U)}, \]

and the corresponding absolute tolerance for $\mathbb{E}$,

\[ \mathrm{ATOL} = \gamma \times \mathrm{TOL}. \]

We replace (3.4) by

\[ \mathbb{E}_x(U) \le \frac{\mathrm{ATOL}}{2} \quad \text{and} \quad \mathbb{E}_t(U) \le \frac{\mathrm{ATOL}}{2}. \tag{3.8} \]

Note that ATOL $\approx$ TOL when there is little cancelation among the element contributions
and ATOL $>$ TOL otherwise. In this way, we attempt to mitigate the
inefficiency that is introduced by replacing an accurate error estimate by an inaccurate
bound in decisions about mesh refinement. This approach for setting tolerances
is discussed further in [16].
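
To make the role of the overestimation factor concrete, the following Python sketch computes $\gamma$ and ATOL from a list of signed element contributions. It assumes that the estimate is the magnitude of the sum of the contributions and that the bound is the sum of their magnitudes, as in (2.6) and Definition 3.2; the function names and the sample tolerance are hypothetical, and the sample data are the contributions of Example 3.1.

```python
def overestimation_factor(contribs):
    """Ratio of the element-wise bound to the estimate for signed contributions."""
    estimate = abs(sum(contribs))            # cancelation is retained
    bound = sum(abs(c) for c in contribs)    # cancelation is discarded
    if estimate == 0.0:
        raise ValueError("estimate vanishes; the factor is undefined")
    return bound / estimate


def absolute_tolerance(contribs, tol):
    """Tolerance for the bound, ATOL = gamma * TOL."""
    return overestimation_factor(contribs) * tol


# Contributions of the 14 time steps in Example 3.1.
contributions = [0.1] + [-0.033] * 3 + [0.011, -0.01] * 3 + [0.02] * 4
gamma = overestimation_factor(contributions)
atol = absolute_tolerance(contributions, tol=0.01)
print(gamma, atol)
```

When all contributions share one sign the factor is 1 and ATOL coincides with TOL; the more cancelation there is, the larger the tolerance applied to the bound.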

3.2.2. Predicting refinement in space. Given a local space-time element $\Theta =
\Theta(\Delta, n) = \Delta \times [t_{n-1}, t_n]$ in the space-time slab $S_n$ that is marked for refinement, we
show how to predict the number of space-time elements that are needed to meet the
acceptance criterion. We assume that in the current mesh, there are $N$ time steps and
$M$ space elements in each space-time slab, giving a total of $NM$ space-time elements.
We define a local absolute tolerance

\[ \mathrm{LATOL} = \frac{\mathrm{ATOL}}{2NM}. \]

By the Principle of Equidistribution, we adopt the goal of refining each space-time
element so that the local element contribution is approximately LATOL.
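Using a priori convergence analysis, see [15], it is possible to show that there is a
constant $C$ such that

\[ \mathbb{E}_x|_{\Theta} \approx C h_\Delta^{p} \tag{3.9} \]

as $h_\Delta \to 0$, where $p$ is related to the order of the finite element method in space and
$h_\Delta$ is the element size. Likewise, we can show that there is a constant $C$ such that

\[ \mathbb{E}_t|_{\Theta} \approx C k^{q} \tag{3.10} \]

as $k \to 0$, where $q$ is related to the order of the finite element method in time.

Now suppose that an element $\Theta_{new}$ in the final mesh is obtained from $\Theta_{old}$ in the
current mesh by refinement. We have

\[ \mathrm{LATOL} \approx \mathbb{E}_x|_{\Theta_{new}} \approx \Big( \frac{h_{\Delta_{new}}}{h_{\Delta_{old}}} \Big)^{p} \, \mathbb{E}_x|_{\Theta_{old}}. \tag{3.11} \]

This yields a prediction for the new mesh size,

\[ h_{\Delta_{new}} \approx \Big( \frac{\mathrm{LATOL}}{\mathbb{E}_x|_{\Theta_{old}}} \Big)^{1/p} \times h_{\Delta_{old}}. \tag{3.12} \]

Recalling that $d$ is the space dimension, this predicts that the element $\Delta_{old}$ should
be refined into roughly

\[ \Big( \frac{h_{\Delta_{old}}}{h_{\Delta_{new}}} \Big)^{d} \approx \Big( \frac{\mathbb{E}_x|_{\Theta_{old}}}{\mathrm{LATOL}} \Big)^{d/p} \tag{3.13} \]

sub-elements.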

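3.2.3. Predicting refinement in time. For refinement in time,

\[ \mathbb{E}_t|_{\Theta_{new}} \approx \Big( \frac{k_{new}}{k_{old}} \Big)^{q} \, \mathbb{E}_t|_{\Theta_{old}} \approx \mathrm{LATOL}. \tag{3.14} \]

This yields a prediction for the new time step,

\[ k_{new} \approx \Big( \frac{\mathrm{LATOL}}{\mathbb{E}_t|_{\Theta_{old}}} \Big)^{1/q} \times k_{old}. \tag{3.15} \]

This predicts that the time step $k_{old}$ should be refined into roughly

\[ \frac{k_{old}}{k_{new}} \approx \Big( \frac{\mathbb{E}_t|_{\Theta_{old}}}{\mathrm{LATOL}} \Big)^{1/q} \tag{3.16} \]

sub-intervals.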
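
The predictions of §3.2.2 and §3.2.3 reduce to a few power-law evaluations. The Python sketch below is one possible reading of the local tolerance and of (3.13) and (3.16); rounding the predicted counts up to an integer and never predicting fewer than one child are assumptions added here, and the function names are hypothetical.

```python
import math


def local_tolerance(atol, n_steps, n_elems):
    """LATOL = ATOL / (2 N M) for N time steps and M space elements per slab."""
    return atol / (2 * n_steps * n_elems)


def predict_space_children(bound_x, latol, d, p):
    """Number of sub-elements suggested by (3.13): (E_x / LATOL)**(d/p)."""
    return max(1, math.ceil((bound_x / latol) ** (d / p)))


def predict_time_children(bound_t, latol, q):
    """Number of sub-intervals suggested by (3.16): (E_t / LATOL)**(1/q)."""
    return max(1, math.ceil((bound_t / latol) ** (1.0 / q)))


if __name__ == "__main__":
    latol = local_tolerance(atol=1.0, n_steps=4, n_elems=8)
    # An element whose space contribution is 32 times LATOL, in 3D with p = 2.
    print(predict_space_children(32 * latol, latol, d=3, p=2))
    # A time step whose contribution is 4 times LATOL with q = 2.
    print(predict_time_children(4 * latol, latol, q=2))
```

Elements whose contribution is already below LATOL are left unrefined in this sketch, which matches the use of the predictions for refinement only; de-refinement would need a separate rule.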
3.2.4. Determining overlaps for meshes on adjacent blocks. After the
meshes for each block are determined based on the a posteriori prediction of error, we
need to estimate the effects of transferring the solution between meshes on adjacent
blocks. See §4.1 for an example that illustrates this point. Recall that (2.9) provides
a bound on these effects. The difficulty with using (2.9) is that we do not have the
fine scale numerical solution $U$ required for that expression until after solving on the
fine scale, whereas ideally we could predict a reasonable overlap before computing the
expensive fine scale solution.

We list three strategies for mitigating the possibility of projection error in our
block adaptive framework.

1. There is a very simple strategy. In forming the space mesh for the block
$[T_{b-1}, T_b] \times \Omega$, we guide refinement by using the maximum of the element
contributions on each individual element, taking the maximum over the time
intervals included in the block. We may simply include the maximum over
the last time interval included in the previous block, $[T_{b-2}, T_{b-1}]$, i.e., over
the interval $I_{b-1,N_{b-1}}$. We can be even more conservative by
including some number of the last time steps in the maximum computation.

2. We can use gradient recovery [6] to compute an approximate solution on the
fine scale mesh in each block using the solution from the last time interval
contained in each block. We can then directly compute $(I - P_b)U$ for each $b$
and evaluate (2.9).

3. We can evaluate (2.9) a posteriori by evaluating $(I - P_b)U$ using the fine scale
forward solution and the coarse scale adjoint solution.
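
The first strategy in the list above can be sketched in a few lines of Python. The sketch assumes that the refinement indicators are stored as a list of coarse time slabs, each holding one value per coarse spatial element on a fixed coarse mesh; the function name `block_indicator` and the `overlap` parameter are hypothetical.

```python
def block_indicator(indicators, first, last, overlap=1):
    """Element-wise maximum of the indicators over the slabs of one block.

    indicators[n][i] is the indicator of coarse element i on coarse slab n.
    The block consists of slabs first..last (inclusive); `overlap` slabs from
    the end of the previous block are included in the maximum as well.
    """
    start = max(0, first - overlap)
    slabs = indicators[start:last + 1]
    return [max(column) for column in zip(*slabs)]


if __name__ == "__main__":
    # A feature that moves one element to the right per slab.
    indicators = [
        [4, 1, 1],
        [1, 4, 1],
        [1, 1, 4],
        [1, 1, 2],
    ]
    print(block_indicator(indicators, first=2, last=3, overlap=0))
    print(block_indicator(indicators, first=2, last=3, overlap=1))
```

With the overlap included, the mesh for the second block is also refined where the feature sat at the end of the first block, so the solution transferred between the blocks is better resolved.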

3.3.  Block  adaptive  algorithms.  Using  the  development  above,  we  present  a 
generic  block  adaptive  algorithm  in  Algorithm  3.1.  We  provide  a  detailed  algorithm 
in  Appendix  A. 


Algorithm 3.1 Block Adaptive Algorithm
1: Choose the “coarse” mesh and time step
2: Compute the coarse scale numerical solution
3: Estimate the element contributions to the error for the current solution
4: Predict the number of space-time elements into which each current space-time
element is to be divided using (3.13) and (3.16)
5: Build block discretizations by constructing meshes satisfying the requirements for
groups of neighboring time steps
6: Compute the fine scale numerical solution using the block discretizations


We note that the Block Adaptive Algorithm 3.1 can be iterated, so that the fine
scale becomes the new coarse scale, and a new fine scale is subsequently computed.
In crude terms, the block adaptive Algorithm 3.1 is analogous to the core
estimate-mark-refine algorithm at the heart of the generic Algorithm 1.1, but is different in
the mark and refine steps. The critical step defining the block adaptive
Algorithm 3.1 is the strategy used to create block discretizations. Once the blocks
are identified, we can use any adaptive mesh refinement strategy for producing the
actual meshes. We describe several strategies for determining block discretizations.

3.3.1. A memory-bound strategy. In the first strategy, we assume there is
a target number of elements $N_{max}$ in space that is maximal in some sense, e.g., the
largest number of elements that can be stored in core. We form blocks by creating a
union of adjacent coarse-scale space-time slabs, one slab at a time, until the projected
space mesh for the block uses $N_{max}$ elements. To create the block mesh, we use the
maximum of the predicted number of elements $N_{\mathrm{elem\_children}}$ on each individual
element (given by equation (3.13)) in the union forming the block. We illustrate in
Fig. 3.2. The parameter $\theta$ governs how often the mesh is replaced by a coarser mesh,
where $\theta \approx 10$ works well in practice.
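
A greedy sweep of this kind might be sketched as follows in Python. The sketch assumes a non-empty list of coarse slabs on a fixed coarse mesh, measures the size of the projected block mesh by summing the element-wise maxima of the predicted numbers of sub-elements, and closes a block as soon as adding the next slab would exceed the target; it does not model the parameter $\theta$, and the function name is hypothetical.

```python
def memory_bound_blocks(children, n_max):
    """Group adjacent coarse slabs into blocks of at most n_max projected elements.

    children[n][i] is the predicted number of sub-elements of coarse element i
    on slab n. Returns a list of (first_slab, last_slab) index pairs. A single
    slab that already exceeds n_max still forms its own block.
    """
    blocks = []
    start = 0
    current = list(children[0])
    for n in range(1, len(children)):
        candidate = [max(a, b) for a, b in zip(current, children[n])]
        if sum(candidate) > n_max:
            blocks.append((start, n - 1))      # close the current block
            start, current = n, list(children[n])
        else:
            current = candidate                # absorb slab n into the block
    blocks.append((start, len(children) - 1))
    return blocks


if __name__ == "__main__":
    # A pulse that moves one coarse element to the right per slab.
    children = [
        [4, 1, 1, 1],
        [1, 4, 1, 1],
        [1, 1, 4, 1],
        [1, 1, 1, 4],
    ]
    print(memory_bound_blocks(children, n_max=10))
```

A larger target lets each block span more slabs at the price of a finer mesh on every slab of the block, which is the trade-off the memory-bound strategy is meant to control.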


[Figure 3.2. Left panel: “Division into Space-Time Blocks”. Right panel: “Predicted Mesh for the First Block”. Horizontal axes: Space.]

Fig. 3.2. The memory bound strategy is used for a traveling pulse that moves with constant
speed from left to right. Left: The original uniform mesh and a contour plot of the predicted
number of new sub-elements $N_{\mathrm{elem\_children}}$. The scale is from dark (low) to white
(high). Right: The predicted number of new sub-elements $N_{\mathrm{elem\_children}}$ for the first block, which
consists of three adjacent space-time slabs from the original discretization.


3.3.2. A correlation strategy. In the second strategy, we aim to choose blocks
in order to use a relatively small number of elements, so $N_{max}$ may be considerably
smaller than for the first algorithm. This strategy forms a block by grouping together
adjacent coarse-scale space-time slabs whose predicted numbers of elements
$N_{\mathrm{elem\_children}}$ are close.

In [14], we consider the problem of detecting significant overlap of local element
contributions for different computations. Following the approach there, given two
vectors $v$, $w$ whose coefficients are element contributions to an error estimate, we
define their correlation to be $c(v, w) = v \cdot w$. We say that $v$ is significantly correlated
with $w$ if


c{v,  w) 

w 


>  7i  and 


<  72- 


where $0 < \gamma_1, \gamma_2$. The first condition ensures that $v$ has a suitably large projection
onto $w$ while the second condition corrects for differences in scale between $v$ and $w$
(consider $\|v\| \gg \|w\|$ so that $c(v, w) \gg \|w\|$).

We implement the new criterion for creating blocks by choosing to add the next
time slab to a current block based on the correlation criterion.

3.3.3. Global strategies. In the first two strategies for creating blocks, we
sweep through time. We can also use a bisection search beginning with the original
large block and subdividing to find acceptable blocks. In analogy to the difference
between the standard global strategy for space mesh refinement to achieve the Principle
of Equidistribution and the local error control approach, the bisection search is
a global strategy that can be a more efficient way to achieve equidistribution.
4. Computational Examples. We apply the block adaptive algorithms to several
prototypical examples in one and three space dimensions. The one dimensional examples
illustrate several key points when implementing block-adaptive methods, while
the three dimensional examples include a traveling wave front, a solution that undergoes
time- and space-localized perturbations, and a periodic motion in a
convection-dominated flow.

The forward problems and adjoint problems are solved with linear and quadratic
elements in space and dG(0) and cG(1) in time, respectively. The one dimensional examples
are computed using the Matlab code ACES [25]. The three dimensional examples
are performed on a hexahedral mesh using a trilinear spatial basis for the
forward problem and a triquadratic basis for the adjoint. Local mesh refinement is
accomplished by the use of hanging nodes, where one hanging node per edge or face
is allowed. Conformity of the basis is obtained by interpolation of the surrounding
regular nodes. The use of a hierarchical octree-based data structure assists refinement
and also allows for de-refinement when the element indicators are small. For the
convection driven flow problem, SUPG is employed for both the forward and adjoint
problems, with parameter


\[ (1/\Delta t + U/h)^{-1}, \]

where $\Delta t$ is the time step and $U$ is the speed of the convection field at the current
time, i.e., $U = \|\beta\|_2$ in (4.5). This is not an obstacle for the block-adaptive framework,
as we simply modify the theoretical convergence rate $p$ in the computation of
$N_{\mathrm{elem\_children}}$ in (3.13).

4.1. Example One: Projection errors between blocks. We illustrate the
necessity for addressing the effect of transferring solutions between space-time blocks
with a simple one-dimensional example involving a traveling wave,

\[ \begin{aligned} u_t - u_{xx} &= f(x,t), && 0 < x < 1,\ 0 < t, \\ u(0,t) = u(1,t) &= \beta(t), && 0 < t, \\ u(x,0) &= \tanh(\alpha(x - 0.2)), && 0 < x < 1, \end{aligned} \tag{4.1} \]

where $\alpha = 50$ and $f$ and $\beta$ are chosen to give an exact solution $u = \tanh(\alpha(x - t - 0.2))$.
We solve with a coarse mesh using $h = 0.1$ and time step $k = 0.05$ from initial time
0 to final time 0.6. The quantity of interest is the average space-time error. We
compute a fine scale solution using two blocks derived from the coarse scale solution.
The first block, $t \in [0, 0.3]$, uses a finer spatial mesh in the region $x \in [0.1, 0.6]$, while
the second block uses a fine mesh in the region $[0.5, 1]$, so the overlap is minimal
and the predictions for refinement areas are incorrect. Consequently, the approximate
traveling wave travels too quickly. The first block solution at $t = 0.3$ and its projection
onto the second block at $t = 0.3$ are displayed in Fig. 4.1.

In Fig. 4.1 we illustrate the a posteriori use of (2.9) to correct the projection
error. Block 1 is computed using the predicted fine scale mesh. Block 2 is tested for
significant projection error by applying (2.9) to the fine scale solution for Block 1, and
the mesh for Block 2 is refined if the elementwise projection error exceeds LATOL.
We note that the overlap strategy for the projection error in §3.2.4 also works well in
this particular example.
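The effect of insufficient overlap can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it transfers the exact wave profile at $t = 0.3$ by piecewise linear interpolation, which stands in for the projection used in the paper and is not the estimate (2.9), and the fine mesh size `h_fine = 0.0125` and all function names are hypothetical.

```python
import math
from bisect import bisect_right

ALPHA = 50.0  # steepness of the wave, as in problem (4.1)


def wave(x, t):
    """Exact traveling wave u = tanh(alpha (x - t - 0.2))."""
    return math.tanh(ALPHA * (x - t - 0.2))


def block_mesh(fine_lo, fine_hi, h_coarse=0.1, h_fine=0.0125):
    """Nodes on [0, 1]: coarse everywhere, fine inside [fine_lo, fine_hi].

    h_fine is an assumed value chosen only for illustration.
    """
    n_coarse = round(1.0 / h_coarse)
    n_fine = round((fine_hi - fine_lo) / h_fine)
    nodes = {round(i * h_coarse, 10) for i in range(n_coarse + 1)}
    nodes |= {round(fine_lo + k * h_fine, 10) for k in range(n_fine + 1)}
    return sorted(nodes)


def interpolate(nodes, values, x):
    """Evaluate the piecewise linear interpolant at x."""
    j = min(max(bisect_right(nodes, x) - 1, 0), len(nodes) - 2)
    w = (x - nodes[j]) / (nodes[j + 1] - nodes[j])
    return (1.0 - w) * values[j] + w * values[j + 1]


def max_transfer_error(nodes, t, n_samples=2001):
    """Largest sampled error after representing the wave on the mesh."""
    values = [wave(x, t) for x in nodes]
    xs = [i / (n_samples - 1) for i in range(n_samples)]
    return max(abs(interpolate(nodes, values, x) - wave(x, t)) for x in xs)


mesh_block1 = block_mesh(0.1, 0.6)  # refinement region of the first block
mesh_block2 = block_mesh(0.5, 1.0)  # refinement region of the second block
err_block1 = max_transfer_error(mesh_block1, t=0.3)
err_block2 = max_transfer_error(mesh_block2, t=0.3)
print(err_block1, err_block2)
```

At $t = 0.3$ the front sits at $x = 0.5$, so the left half of the front falls on a coarse element of the second mesh, and the transfer error there is much larger than on the first mesh.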

Fig. 4.1. Problem (4.1). The circles indicate the spatial meshes used in each of the two blocks.
Left: the solution on Block 1. Middle: the projection of the approximate solution in Block 1 onto
the mesh in Block 2. Right: the solution on Block 2 after using the projection error estimate
(2.9) to correct significant projection errors between the two blocks. This demonstrates the possible
consequences when the meshes for neighboring blocks do not overlap sufficiently.

4.2. Example Two: Coarse scale resolution. Since we are using the coarse
scale discretization to predict the global behavior of the solution on the fine scale,
it is important to ensure that the coarse scale discretization is not too coarse. (This
is a difference between the block adaptive approach and standard adaptive mesh
refinement, which is generally started with a very coarse mesh.) This issue is especially
important  for  nonlinear  problems  since  linearization  is  used  to  define  the  adjoint 
problem,  which  in  turn  provides  the  means  to  quantify  the  effects  of  cancelation  and 
accumulation  of  errors. 

Consider  the  one-dimensional  nonlinear  parabolic  equation 

$$\begin{cases} u_t - \frac{1}{2\alpha} u_{xx} = \alpha (u - 1)(1 - u^2), & -1 < x < 1,\ 0 < t < 0.6, \\ u(0,t) = -1, \quad u(1,t) = 1, & 0 < t, \\ u(x,0) = \tanh(\alpha(x - 0.2)), & -1 < x < 1. \end{cases} \qquad (4.2)$$

We choose $\alpha$ to obtain the same solution as the example in §4.1, $u = \tanh(\alpha(x -
t - 0.2))$. The quantity of interest is the average space-time error. For the coarse
discretization, we use $h = 0.05$ and $k = 0.05$. These choices provide an excellent
coarse scale discretization for the linear example in §4.1 but do not work well for
the nonlinear version. We show two snapshots of the solution $u$ in Fig. 4.2 at $t = 0.3$
and $t = 0.6$. The wave speed is predicted inaccurately, which leads to a poor block
selection, and this subsequently affects the fine scale accuracy. Using a coarse scale
discretization with $h = 0.1$ and $k = 0.1$ also yields inaccurate results.
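As a brief check, and assuming that (4.2) has diffusion coefficient $1/(2\alpha)$ and reaction term $\alpha(u - 1)(1 - u^2)$, one can verify directly that the traveling wave of §4.1 solves the nonlinear problem. Writing $\xi = \alpha(x - t - 0.2)$ and $u = \tanh\xi$, so that $\mathrm{sech}^2\xi = 1 - u^2$:

```latex
\begin{aligned}
u_t    &= -\alpha\,\mathrm{sech}^2\xi = -\alpha\,(1 - u^2), \\
u_{xx} &= -2\alpha^2 \tanh\xi\,\mathrm{sech}^2\xi = -2\alpha^2\,u\,(1 - u^2), \\
u_t - \tfrac{1}{2\alpha}\,u_{xx}
       &= -\alpha\,(1 - u^2) + \alpha\,u\,(1 - u^2)
        = \alpha\,(u - 1)(1 - u^2).
\end{aligned}
```

The same profile therefore satisfies both the linear and the nonlinear example, which is what makes the comparison of coarse discretizations meaningful.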


Fig. 4.2. Problem (4.2). Correlation strategy with an insufficiently accurate coarse-scale solution.
Solution on the adapted mesh at $t = 0.3$ and $t = 0.6$ respectively.


The poor predictions based on the coarse-scale discretization can be avoided by
slightly enriching the discretization with a finer time step. We use a coarse discretization
with $h = 0.05$ and $k = 0.01$ and the correlation strategy to produce blocks. The
approximate solution on the adapted mesh at $t = 0.45$ is shown in Fig. 4.3.


Fig. 4.3. Problem (4.2). Correlation strategy with an improved coarse-scale solution. Solution
on the adapted mesh at $t = 0.45$ on blocks 3 and 4 respectively.


4.3. Example Three: A traveling wave solution. This example is a wave
propagating along the main diagonal of the unit cube ($\Omega = [0,1] \times [0,1] \times [0,1]$). The
governing equation is

$u_t - \Delta u = f(x,t), \quad x \in \Omega,\ 0 < t,$
$u(x,t) = 0, \quad x \in \partial\Omega,\ 0 < t,$


[?t(x,0)  =  (xi  -  Xi)(x2  —  X2)(x3  -  X3)arctan(^^\/xf  d-Xj  +X3),  x  6  ft, 
where  c  =  75  and  /  is  constructed  to  yield  the  exact  solution 


(4.3) 


u  =  ^  arctan(^^  Jx\+x\  +  x\  -  1). 

O  0  * 


The coarse block solution $u_c$ is constructed on an 8 x 8 x 8 uniform
hexahedral mesh with an initial time step of 0.1. The quantity of interest is the
time average of the solution value. The memory bound strategy is used to construct
the discretization blocks with ATOL = 0.000178 and Nmax = 50000. Block information
is given in Table 4.1. As might be expected, all of the blocks use approximately the
same number of elements. We show contour plots of the solution on “slices” of some
of the block meshes along the plane $x = 0.5$ in Fig. 4.4.


Block 

3  6-1 

wm 

#  vertices 

#  hexahedra 

1 

0 

0.4 

58711 

50394 

2 

0.6 

54503 

3 

MEM 

0.7 

61265 

4 

mtwm 

62626 

52368 

5 

1 

■Q&ai 

54860 

6 

1 

1.1 

54377 

Table 4.1

Problem (4.3). Blocks resulting from the memory bound strategy.


Fig. 4.4. Problem (4.3). Memory bound strategy. Slices through the mesh perpendicular to the
x-axis at $x = 0.5$. Upper left: $t = 0$ (block 1). Upper right: $t = 0.44$ (block 2). Lower left: $t = 0.6$
(block 3). Lower right: $t = 1.1$ (block 6).

4.4. Example Four: Localized forcing in space and time. This example
contrasts the blocks produced by the memory bound and correlation
strategies when solving an equation with source terms that are localized in space and
time. The governing equation on the unit cube is

$u(x, 0) = 0, \quad x \in \Omega,$

with homogeneous Neumann boundary conditions on all the sides except the bottom,
where a homogeneous Dirichlet condition is imposed. We choose $\alpha_1 = 50$, $\alpha_2 = 10$,
$t_1 = 1$, $t_2 = 10$, $x_1 = (0.125, 0.125, 0.125)$ and $x_2 = (0.75, 0.5, 0.75)$. The quantity of
interest is the time average of the solution value.

We use a coarse discretization consisting of an 8 x 8 x 8 uniform hexahedral mesh
and time step of 0.1. With ATOL = 0.00010044 and Nmax = 50000 we show the
block information for the memory bound and correlation strategies respectively in
Table 4.2 and Table 4.3. The algorithms lead to significantly different block meshes.
Block    T_{b-1}    T_b     # vertices    # hexahedra
1        0          1.1     59465         54125
2        1.1        1.2     63112         57772
3        1.2        2.4     45359         40958
4        2.4        11.9    12383         10165
5        11.9       14.9    2029          1478

Table 4.2

Problem (4.4). Blocks resulting from the memory bound strategy.


Block    T_{b-1}    T_b     # vertices    # hexahedra
1        0          1.1     63112         57772
2        1.1        1.2     63112         57772
3        1.2        1.6     45359         40958
4        1.6        2.5     9611          8037
5        2.5        2.9     1968          1436
6        2.9        8.5     966           652
7        8.5        9       2617          1926
8        9          10.8    12651         10382
9        10.8       11.3    7363          5860
10       11.3       12.6    3139          2360
11       12.6       14.9    729           512

Table 4.3

Problem (4.4). Blocks resulting from the correlation strategy.


The correlation strategy chooses many more blocks, but many of the blocks have very
low numbers of elements.

We show planar slices near $x_1$ and $x_2$ of the meshes for Blocks 1 and 3 in Fig. 4.5.
For comparison, we show planar slices perpendicular to the x-axis near $x_1$ and $x_2$ of
the meshes for blocks constructed using the two strategies in Fig. 4.6. Both strategies
result in similar meshes near $x_2$ at time $t = 10$. However, at $t = 8.8$, the correlation
strategy leads to coarse meshes that are not produced by the memory bound strategy.
The mesh resulting from the memory bound strategy retains the refinement resulting
from the earlier perturbation near $x_1$ at $t = 1$.

4.5. Example Five: Periodic motion in a convection-dominated flow.
This example has a heat source with a forced oscillating convective term within the
unit cube $\Omega$ to produce an “orbiting” zone of perturbation. The governing equation
is

$$\begin{cases} u_t + \beta \cdot \nabla u - \Delta u = f, & x \in \Omega,\ 0 < t < 1, \\ u(x,t) = 0, & x \in \partial\Omega,\ 0 < t < 1, \\ u(x,0) = 0, & x \in \Omega, \end{cases} \qquad (4.5)$$

with $\beta = 20(\cos(\pi t)\sin(2\pi t), \sin(\pi t)\sin(2\pi t), \cos(2\pi t))$ and $f(x) = e^{-50(x_1^2 + x_2^2 + x_3^2)}$.
The quantity of interest is the time average value. The coarse discretization used
4913 vertices and a time step of 0.01. The blocks constructed by the memory-bound
strategy using ATOL = 0.00044 and Nmax = 50000 are described in Table 4.4.


Fig. 4.6. Problem (4.4). Slices through the mesh perpendicular to the x-axis. Left: Correlation
strategy. Slice near $x_2$ at $t = 10$ (block 8). Middle: Correlation strategy. Slice near $x_1$ at $t = 8.8$
(block 7). Right: Memory bound strategy. Slice near $x_1$ at $t = 8.8$ (block 4).


22  V.  CAREY,  D.  ESTEP,  A.  JOHANSSON,  M.  LARSON,  AND  S.  TAVENER 


Block    T_b     T_{b+1}    # vertices    # hexahedra
1        0       0.09       58799         51066
2        0.09    0.15       58424         50289
3        0.15    0.27       58393         50359
4        0.27    0.61       59102         50744
5        0.61    0.99       28395         23388

Table 4.4

Problem (4.5). Blocks resulting from the memory bound strategy.


We provide “slices” through the mesh that are perpendicular to the x-axis at
$x = 0.5$ for four representative times in Fig. 4.7.

5. Conclusions. In this paper, we consider adaptive algorithms for evolution
problems  that  use  a  sequence  of  “blocks”  in  time  which  employ  fixed,  non-uniform 
space  meshes.  Blockwise  adaptive  algorithms  provide  a  way  to  balance  the  goal  of 
achieving  desired  accuracy  using  discretizations  with  relatively  few  degrees  of  freedom 
with  the  computational  overhead  associated  with  load  balancing,  re-meshing,  matrix 
reassembly  and  error  estimation.  Block  adaptive  algorithms  achieve  this  balance  by 
minimizing  the  number  of  mesh  changes.  However,  a  major  issue  is  determining  block 
discretizations  from  coarse  scale  solution  information  that  achieve  the  desired  accuracy 
and  efficiency.  We  describe  several  strategies  to  achieve  this  goal  using  adjoint-based 
a  posteriori  error  estimates.  We  demonstrate  the  behavior  of  the  proposed  algorithms 
as  well  as  several  technical  issues  in  a  set  of  examples. 
Appendix A. Detailed description of a block adaptive algorithm.


The  notation  used  in  our  block  adaptive  algorithm  is  as  follows. 


1. Ntimestep = current number of time steps

2. Nelem(j) = number of space elements in timestep j, i.e., for $t \in [t_{j-1}, t_j]$

3. Ntimestep_children(j) = number of subintervals into which timestep j is
to be divided

4. Nelem_children(i, j) = number of subelements into which finite element i is
to be divided in timestep j

5. The $b$th “block” is the time interval $[t_{b,0}, t_{b,N_b}]$

6. The $b$th “block” comprises timesteps . . . , i.e., $N_b = j_b - j_{b-1}$, $t_{b,0} = t_{j_{b-1}}$
and $t_{b,N_b} = t_{j_b}$.

7. block(i, b) = number of intervals the parent element i will be divided into
on block b.

8. Nelem_block(b) = number of elements in block b.

9. We use the Matlab colon operator : to denote the full row or column.

10. The parameter $\theta$ governs how often a mesh is coarsened; $\theta \approx 10$ works well.
Algorithm A.1 A memory-bound strategy
Input error tolerance TOL, maximum number of elements in any block Nmax, the
    initial coarse-scale discretization for the forward problem, and the coarse-scale
    discretization for the adjoint problems
Solve forward problem (1.1) for U on [0, T]
Project forward solution onto coarse-scale adjoint problem mesh
Solve adjoint problem (2.4) on coarse scale mesh and compute E(U)
Compute LATOL, (3.6), (3.5)
for j = 1, . . . , Ntimesteps do
    Compute Ntimestep_children(j) (3.13)
    for i = 1, . . . , Nelem(j) do
        Compute Nelem_children(i, j) (3.16)
    end for
end for
Ntimesteps $\leftarrow \sum_j$ Ntimestep_children(j)
Each subinterval of timestep j inherits Nelem_children(i, j)
$b = 1$, $T_0 = 0$, $T_1 = k_1$, $j_0 = 1$, $j = 2$
block(:, b) $\leftarrow$ Nelem_children(:, 1)
Nelem_block(b) $\leftarrow \sum_i$ block(i, b)
while $T_b < T$ do
    while Nelem_block(b) < Nmax and
          Nelem_block(b) < $\theta \times \sum_i$ Nelem_children(i, j) do
        $j_b \leftarrow j$
        $T_b \leftarrow T_b + k_j$
        block(:, b) $\leftarrow$ max[block(:, b), Nelem_children(:, j)]
        Nelem_block(b) $\leftarrow \sum_i$ block(i, b)
        $j \leftarrow j + 1$
    end while
    $b \leftarrow b + 1$
end while
for i = 1, . . . , b do
    Compute new mesh for block b
    Optional: Estimate projection error and correct predicted meshes if necessary
end for
for i = 1, . . . , b do
    Solve forward problem on block b for U
    Project U onto mesh for block b + 1
    Optional: Compute projection error between blocks and correct meshes
end for
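The block-building loop of Algorithm A.1 can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function name, the 0-based timestep indices, the name `theta` for the coarsening parameter, and the explicit start of a new block from the next timestep (which the listing leaves implicit) are all assumptions. The per-element refinement predictions are taken as given input.

```python
def memory_bound_blocks(nelem_children, nmax, theta):
    """Group consecutive timesteps into blocks.

    nelem_children[j][i] is the predicted number of children of coarse
    element i in timestep j (0-based indices, an assumption of this sketch).
    Returns a list of (first_step, last_step, block) tuples, where block[i]
    is the number of children of element i used throughout the block.
    """
    blocks = []
    n_steps = len(nelem_children)
    j = 0
    while j < n_steps:
        first = j
        block = list(nelem_children[j])  # start a new block from timestep j
        j += 1
        while j < n_steps:
            size = sum(block)
            # Stop growing the block when it has reached the memory bound, or
            # when it is more than theta times larger than what timestep j
            # needs (i.e. the mesh should be coarsened).
            if size >= nmax or size >= theta * sum(nelem_children[j]):
                break
            # Absorb timestep j: each element keeps its finest requirement.
            block = [max(a, c) for a, c in zip(block, nelem_children[j])]
            j += 1
        blocks.append((first, j - 1, block))
    return blocks


# Hypothetical predictions: refinement near element 2 for two timesteps,
# followed by two timesteps that need no refinement.
children = [[1, 1, 8], [1, 2, 8], [1, 1, 1], [1, 1, 1]]
blocks = memory_bound_blocks(children, nmax=50, theta=2)
for first, last, block in blocks:
    print(first, last, block, sum(block))
```

As in the listing, the size test is made before a timestep is absorbed, so in this sketch a block may end up somewhat larger than `nmax` after its last merge.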


To implement the correlation-based strategy, we replace the block selection criterion
(Nelem_block(b) < Nmax) with a step which accepts a block if block(:, b) is correlated to
Nelem_children(:, j) and Nelem_block(b) is less than Nmax.

The  algorithm  assumes  that  the  blocks  are  always  generated  (even  on  repeat  solve 
cycles)  using  the  coarse  mesh  as  a  base.  The  algorithm  may  be  easily  modified  to 
work  recursively  on  the  blocks.  It  may  also  be  modified,  with  a  little  more  care,  to 
allow  merging  and  splitting  of  blocks  during  repeated  solves. 
ERROR ESTIMATES FOR MULTISCALE OPERATOR
DECOMPOSITION  FOR  MULTIPHYSICS  MODELS 


1.1  Introduction 

Multiphysics, multiscale models that couple different physical processes acting
across a large range of scales are encountered in virtually all scientific and engineering
applications. Such systems present significant challenges in terms of computing
accurate solutions and for estimating the error in information computed
from numerical solutions. In this chapter, we discuss the problem of computing
accurate error estimates for one of the most common, and powerful, numerical
approaches for multiphysics, multiscale problems.

1.1.1  Examples  of  multiphysics  models 

Without any attempt to be complete, we describe three examples of multiphysics
models that illustrate some different ways in which physical processes may be
coupled.

Example 1.1 A thermal actuator  A thermal actuator is a MEMS (microelectronic
mechanical switch) device. A contact rests on thin braces composed of
a conducting material. When a current is passed through the braces, they heat
up and consequently expand to close the contact, see Fig. 1.1.

Fig. 1.1. Sketch of a thermal actuator.

The actuator is modeled by a system of three coupled equations, each representing a distinct
physical process acting on its own scale. The first is an electrostatic current
equation

$$\nabla \cdot (\sigma \nabla u_1) = 0, \qquad (1.1)$$
governing the potential $u_1$ (where the current is $J = -\sigma \nabla u_1$), the second is a steady-state
energy equation

$$\nabla \cdot (\kappa(u_2) \nabla u_2) = \sigma (\nabla u_1 \cdot \nabla u_1), \qquad (1.2)$$

governing the temperature $u_2$, and the third is a linear elasticity equation giving the
steady-state displacement $u_3$,

$$\nabla \cdot \left( \lambda\, \mathrm{tr}(E) I + 2\mu E - \beta (u_2 - u_{2,\mathrm{ref}}) I \right) = 0, \qquad E = (\nabla u_3 + \nabla u_3^\top)/2. \qquad (1.3)$$

This is an example of “parameter passing”, in which the solution of one component
is used to compute the parameters and/or data for another component.
Note that the electric potential $u_1$ can be calculated independently of $u_2$ and $u_3$.
The temperature $u_2$ can be calculated once the electric potential $u_1$ is known,
while the calculation of displacement $u_3$ requires prior knowledge of $u_2$, and
therefore of $u_1$.

Example 1.2 The Brusselator problem  First introduced by Prigogine and
Lefever (Prigogine and Lefever, 1968) as a model of chemical dynamics, the
Brusselator problem consists of a coupled set of equations,

$$\begin{cases} \dfrac{\partial u_1}{\partial t} - k_1 \dfrac{\partial^2 u_1}{\partial x^2} = \alpha - (\beta + 1) u_1 + u_1^2 u_2, & x \in (0,1),\ t > 0, \\ \dfrac{\partial u_2}{\partial t} - k_2 \dfrac{\partial^2 u_2}{\partial x^2} = \beta u_1 - u_1^2 u_2, & x \in (0,1),\ t > 0, \\ u_1(0,t) = u_1(1,t) = \alpha, \quad u_2(0,t) = u_2(1,t) = \beta/\alpha, & t > 0, \\ u_1(x,0) = u_{1,0}(x), \quad u_2(x,0) = u_{2,0}(x), & x \in (0,1), \end{cases} \qquad (1.4)$$

where $u_1$ and $u_2$ are the concentrations of species 1 and 2, respectively. Solutions
of the Brusselator problem exhibit a wide range of behavior depending on
parameter values.

Reaction-diffusion equations are an example of a problem that combines different
physics (in this case, reaction and diffusion) in one equation. The generic
picture for a reaction-diffusion equation is a relatively fast, destabilizing reaction
component interacting with a relatively slow, stabilizing diffusion component.
Thus, the physical components have both different scales and different stability
properties.

Example  1.3  Conjugate  heat  transfer  between  a  fluid  and  solid  object 
We  consider  the  flow  of  a  heat-conducting  Newtonian  fluid  past  a  solid  cylinder 


as shown in Fig. 1.2.

Fig. 1.2. Computational domain for flow past a cylinder
The model consists of the heat equation in the solid and the equations governing
the conservation of momentum, mass and energy in the fluid, where we
apply the Boussinesq approximation to the momentum equations in the fluid.
The temperature field is advected by the fluid and couples back to the momentum
equations through the buoyancy term.

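Let $\Omega_F$ and $\Omega_S$ be polygonal domains in $\mathbb{R}^2$ with boundaries $\partial\Omega_F$ and $\partial\Omega_S$
intersecting along an interface $\Gamma_I = \partial\Omega_S \cap \partial\Omega_F$. The complete coupled problem
is

$$\begin{cases} -\mu \Delta u + \rho_0 (u \cdot \nabla) u + \nabla p + \rho_0 \beta T_F g = \rho_0 (1 + \beta T_0) g, & x \in \Omega_F, \\ -\nabla \cdot u = 0, & x \in \Omega_F, \\ -k_F \Delta T_F + \rho_0 c_p (u \cdot \nabla T_F) = Q_F, & x \in \Omega_F, \\ T_S = T_F, & x \in \Gamma_I, \\ k_F (n \cdot \nabla T_F) = k_S (n \cdot \nabla T_S), & x \in \Gamma_I, \\ -k_S \Delta T_S = Q_S, & x \in \Omega_S, \end{cases} \qquad (1.5)$$

where $\rho_0$ and $T_0$ are reference values for the density and temperature respectively,
$\mu$ is the molecular viscosity, $\beta$ is the coefficient of thermal expansion, $c_p$ is the
specific heat, $k_F$ and $k_S$ are the thermal conductivities of the fluid and solid
respectively, $Q_F$ and $Q_S$ are source terms and $n$ is the unit normal vector directed
into the fluid. Note that $u$ is a vector.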

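We define $\Gamma_{u,D}$ and $\Gamma_{u,N}$ to be the boundaries on which we apply Dirichlet
and Neumann conditions for the velocity field respectively, and set

$$\begin{cases} u = g_{u,D}, & x \in \Gamma_{u,D}, \\ \mu \dfrac{\partial u}{\partial n} = g_{u,N}, & x \in \Gamma_{u,N}. \end{cases}$$

Similarly, we define $\Gamma_{T_F,D}$, $\Gamma_{T_F,N}$, $\Gamma_{T_S,D}$, and $\Gamma_{T_S,N}$ to be the boundaries on
which we impose Dirichlet and Neumann conditions for the temperature fields
in the fluid and the solid respectively, and set

$$\begin{cases} T_F = g_{T_F,D}, & x \in \Gamma_{T_F,D}, \\ k_F (n \cdot \nabla T_F) = g_{T_F,N}, & x \in \Gamma_{T_F,N}, \\ T_S = g_{T_S,D}, & x \in \Gamma_{T_S,D}, \\ k_S (n \cdot \nabla T_S) = g_{T_S,N}, & x \in \Gamma_{T_S,N}. \end{cases}$$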

This presents a class of problems where different physics in different physical
domains are coupled through interactions across a common boundary.

1.1.2 Challenges and goals of multiscale, multiphysics models

Multiscale, multiphysics models are characterized by intimate interactions between
different physics across a wide range of scales. This poses challenges for
solving such problems, e.g.
Accurate and efficient computation  Computing information that depends
on solution behavior occurring at very different scales is problematic. It is
rarely possible simply to use a discretization sufficiently fine to resolve
the finest scale behavior.

Complex stability  A multiphysics model generally offers a complex stability
picture that results from a fusion of the stability properties of different
physics. For example, consider a reacting fluid that combines fluid flow
with the dynamical properties of reaction-diffusion equations.

Linking different physics across scales  Understanding the significance of linkages
between physical components and how those affect model output is
another complicated issue. In many situations, the output of one physical
component must be transformed and/or scaled to obtain information
relevant to the other components.

Another  complication  is  the  range  of  applications  of  multiphysics  models. 
These  include 

Model  prediction  Perhaps  the  chief  goal  of  mathematical  modeling  is  to  pre¬ 
dict  the  behavior  of  the  modeled  system  outside  of  the  range  of  physical 
observation. 

Sensitivity  analysis  The  reliability  of  model  predictions  depends  on  analyzing 
the  effects  of  uncertainties  and  variation  in  the  physical  properties  of  the 
model  on  its  output. 

Parameter  optimization  In  design  problems,  the  goal  is  to  determine  opti¬ 
mal  parameter  values  with  respect  to  producing  a  desired  observation  or 
consequence. 

Such applications require computation of solutions corresponding to a wide range
of data and parameters. We expect the solution behavior to vary significantly
and the ability to obtain accurate numerical solutions therefore to vary as well.
This raises a critical need for quantification and control of numerical error.

The solution and application of multiphysics, multiscale models invoke two
computational goals:

•  Compute specific information from multiscale, multiphysics models accurately
and efficiently

•  Accurately quantify the error and uncertainty in any computed information

The context is important:

It is often difficult or impossible to obtain solutions of multiscale, multiphysics
models that are uniformly accurate throughout space and/or time

Thus, we are interested in computing accurate error estimates for solutions that
are relatively inaccurate. This is an important consideration, given that much of
classical error analysis is derived under conditions that amount to assuming that
the numerical solution is in the “asymptotic range of convergence”, meaning that
the solution is sufficiently accurate that the rate of convergence can be observed
by uniform refinement of the discretization. It is rarely possible to reach this
level of discretization in a multiphysics, multiscale problem.


1.1.3 Multiscale, multidiscretization operator decomposition
Multiscale, multidiscretization operator decomposition is a widely used technique
for solving multiphysics, multiscale models. The general approach is to
decompose the multiphysics and/or multiscale problem into components involving
simpler physics over a relatively limited range of scales, and then to seek the
solution of the entire system through some sort of iterative procedure involving
numerical solutions of the individual components. We illustrate in Fig. 1.3.
In general, different components are solved with different numerical methods as
well as with different scale discretizations. This approach is appealing because
there is generally a good understanding of how to solve a broad spectrum of
single physics problems accurately and efficiently, and because it provides an
alternative to accommodating multiple scales in one discretization.

Fig. 1.3. Left: Illustration of a multiscale, multiphysics model. Right: Illustration
of a multiscale operator decomposition solution.

Example 1.4 A classic example of multiscale operator decomposition is operator
splitting for a reaction-diffusion equation,

$$\begin{cases} \dfrac{\partial u}{\partial t} = \epsilon \Delta u + f(u), & x \in \Omega,\ 0 < t, \\ \text{suitable boundary conditions}, & x \in \partial\Omega,\ 0 < t, \\ u(\cdot, 0) = u_0(\cdot), \end{cases} \qquad (1.6)$$

where $\Omega \subset \mathbb{R}^d$ is a spatial domain and $f$ is a smooth function. Accuracy considerations
dictate the use of relatively small steps to integrate a fast reaction
component. On the other hand, stability considerations over moderate to long
time intervals suggest the use of implicit, dissipative numerical methods for integrating
diffusion problems. Such methods are expensive to use per step, but
relatively large steps can be used on a purely dissipative problem. If the reaction
and diffusion components are integrated together, then the small steps required
for accurate resolution of the reaction lead to an expensive computation.

In a multiscale operator splitting approach, the reaction and diffusion components
are integrated independently inside each time interval of a discretization
of time and “synchronized” in some fashion only at the nodes of the interval. The
reaction component is often integrated by using significantly smaller sub-steps
(c.g  10“^  smaller  is  not  uncommon)  than  those  used  to  integrate  the  diffusion 
component,  which  can  lead  to  a  tremendous  computational  savings. 

Employing the method of lines, if we discretize in space using a continuous,
piecewise linear finite element method with $M$ elements, see Sec. 1.4.1, we obtain
the initial value problem: find $y \in \mathbb{R}^l$ such that

$$\begin{cases} \dot{y} = A y(t) + F(y(t)), & 0 < t \le T, \\ y(0) = y_0, \end{cases} \qquad (1.7)$$

where $A$ is an $l \times l$ constant matrix representing a “diffusion component” and
$F(y) = (F_1(y), F_2(y), \ldots, F_l(y))^\top$ is a vector of nonlinear functions representing
a “reaction component”.

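We first discretize $[0, T]$ into $0 = t_0 < t_1 < t_2 < \cdots < t_N = T$ with diffusion
time steps $\{\Delta t_n\}_{n=1}^N$, $\Delta t_n = t_n - t_{n-1}$, and $\Delta t = \max_{1 \le n \le N}(\Delta t_n)$. We define a
piecewise continuous approximate solution

$$\tilde{y}(t) = \frac{t_n - t}{\Delta t_n}\,\tilde{y}_{n-1} + \frac{t - t_{n-1}}{\Delta t_n}\,\tilde{y}_n, \qquad t_{n-1} \le t \le t_n, \qquad (1.8)$$

with nodal values $\{\tilde{y}_n\}$ obtained from the following procedure: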


Algorithm 1 Abstract Operator Splitting for Reaction-Diffusion Equations
Set $\tilde{y}_0 = y_0$
for $n = 1, \ldots, N$ do
    Compute $y^r(t_n)$ satisfying the reaction component

        $\dot{y}^r = F(y^r(t)), \quad t_{n-1} < t \le t_n, \qquad y^r(t_{n-1}) = \tilde{y}_{n-1}$    (1.9)

    Compute $y^d(t_n)$ satisfying the diffusion component

        $\dot{y}^d = A y^d(t), \quad t_{n-1} < t \le t_n, \qquad y^d(t_{n-1}) = y^r(t_n)$    (1.10)

    Set $\tilde{y}_n = y^d(t_n)$
end for


With a little thought, we recognize that this algorithm has the potential
to be a multiscale solution procedure since we can now resolve the solution of
each component on independent scales. That is one benefit of using operator
decomposition. Unfortunately, this decomposition has unforeseen effects on both
accuracy and stability. The reason is that we have discretized the instantaneous
interaction between the reaction and diffusion components.
Example 1.5 In (Estep et al., 2008a), we consider a problem in which the
reaction component exhibits finite time blow up when undamped by the diffusion
component. The problem is

$$\begin{cases} \dot{y} + \lambda y = y^2, & t > 0, \\ y(0) = y_0 \in \mathbb{R}, \end{cases} \qquad (1.11)$$

which has exact solution

$$y(t) = \frac{\lambda y_0}{y_0 - (y_0 - \lambda) e^{\lambda t}}, \qquad (1.12)$$

when $\lambda \ne 0$. The exact solution exists for all time and tends to zero as $t \to \infty$
when $\lambda > y_0$. On the other hand, there is finite time blow up, i.e., $y \to \infty$ at a
finite time, if $\lambda < y_0$.
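For completeness, the closed form $y(t) = \lambda y_0 / (y_0 - (y_0 - \lambda) e^{\lambda t})$ can be recovered with the standard Bernoulli substitution $z = 1/y$, which turns the nonlinear problem into a linear one. The blow-up time below is stated only for the case $y_0 > \lambda > 0$, which is an assumption made here to keep the sign discussion short.

```latex
\begin{aligned}
z = \frac{1}{y}
  &\;\Longrightarrow\;
  \dot z = -\frac{\dot y}{y^2} = -\frac{y^2 - \lambda y}{y^2} = \lambda z - 1, \\
z(t) &= \frac{1}{\lambda} + \Big(\frac{1}{y_0} - \frac{1}{\lambda}\Big) e^{\lambda t}
      = \frac{y_0 - (y_0 - \lambda)\, e^{\lambda t}}{\lambda\, y_0}, \\
y(t) &= \frac{1}{z(t)} = \frac{\lambda\, y_0}{y_0 - (y_0 - \lambda)\, e^{\lambda t}}, \\
t^{*} &= \frac{1}{\lambda} \ln\frac{y_0}{y_0 - \lambda}
      \quad \text{(denominator vanishes; } y_0 > \lambda > 0\text{)}.
\end{aligned}
```

When $\lambda > y_0 > 0$ the denominator grows like $e^{\lambda t}$ and never vanishes, which gives the decay to zero described above.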

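Applying the operator splitting to (1.11), the solutions of the two components
and the operator splitting solution are

$$y^r(t) = \frac{\tilde{y}_{n-1}}{1 - (t - t_{n-1})\,\tilde{y}_{n-1}}, \qquad y^d(t) = y^r(t_n)\, e^{-\lambda (t - t_{n-1})}, \qquad \tilde{y}_n = \frac{\tilde{y}_{n-1}\, e^{-\lambda \Delta t_n}}{1 - \Delta t_n\, \tilde{y}_{n-1}},$$

when the reaction component is defined. We see that decoupling the smoothing
effect provided by instantaneous interaction with the diffusion component means
that the reaction component can blow up in finite time. This has an effect on
numerical solution.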

We consider the time steps introduced above, $\{\Delta t_n\}_{n=1}^N$, to be diffusion time
steps. For each diffusion step, we choose a (small) time step $\Delta s_n = \Delta t_n / M_n$
with $\Delta s = \max_{1 \le n \le N} \Delta s_n$, and the nodes $t_{n-1} = s_{0,n} < s_{1,n} < \cdots < s_{M_n,n} = t_n$
(see Fig. 1.4). We associate the time intervals $I_n = [t_{n-1}, t_n]$ and $I_{m,n} =
[s_{m-1,n}, s_{m,n}]$ with these discretizations. Without going into details, we solve
the components (1.9), (1.10) using the forward and backward Euler method
respectively,

$$Y^r_{m,n} = Y^r_{m-1,n} + F(Y^r_{m-1,n})\,\Delta s_n, \qquad Y^d_n = Y^r_{M_n,n} + A Y^d_n\,\Delta t_n.$$

See Sec. 1.4.4 for details on discretization of evolution problems. We compute a
piecewise linear discrete approximation $Y$ using the nodal values of $Y^d$.

Fig. 1.4. Discretization of time used for multiscale operator splitting
On the left side of Fig. 1.5, we plot the true solution and the nodal values
of the approximation $Y$ computed with $N = 50$ diffusion steps and $M = 1$
reaction step per diffusion step. The approximation is reasonably accurate. Next,
we show the results when the diffusion step is increased by choosing $N = 10$
and, in order to maintain the same resolution of the reaction component, we
correspondingly increase to $M = 5$ reaction steps per diffusion step. The nodal
values of $Y$ are relatively close to those of $y$. The subsequent nodal values of
the reaction component solution $Y^r$ inside each step move away from the true
solution. This large departure is somewhat counteracted by application of the
diffusion operator. The reaction components exhibit significant growth inside
each diffusion step, which severely affects accuracy.

Fig. 1.5. Plots of the approximation $Y$ and the true solution. Left: $N = 50$,
$M = 1$. Middle: $N = 10$, $M = 5$. Right: $N = 5$, $M = 10$. The nodal values
of $Y$ are denoted by the larger points while the smaller points denote nodal
values of the reaction component $Y^r$.

If we increase the diffusion step by taking $N = 5$ and maintain resolution in
the reaction component by taking $M = 10$, the approximation becomes even less
accurate. If we increase the diffusion step further, then the reaction component
actually blows up inside a diffusion step.
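The trend described above can be reproduced with a short Python sketch of the splitting scheme applied to $\dot{y} + \lambda y = y^2$: forward Euler sub-steps for the reaction part $\dot{y} = y^2$ and one backward Euler step for the damping part $\dot{y} = -\lambda y$, which plays the role of the diffusion component in this scalar problem. The values $\lambda = 2$, $y_0 = 1$ and $T = 2$ are assumptions chosen for illustration (the text does not state the values used for Fig. 1.5), and the function names are hypothetical.

```python
import math

LAM, Y0, T_FINAL = 2.0, 1.0, 2.0  # assumed values, not taken from the text


def exact(t, lam=LAM, y0=Y0):
    """Closed-form solution of y' + lam*y = y^2, y(0) = y0."""
    return lam * y0 / (y0 - (y0 - lam) * math.exp(lam * t))


def split_solve(n_diff, m_react, lam=LAM, y0=Y0, t_final=T_FINAL):
    """Return nodal values at the diffusion nodes t_0, ..., t_N."""
    dt = t_final / n_diff  # diffusion step
    ds = dt / m_react      # reaction sub-step
    y = y0
    nodal = [y0]
    for _n in range(n_diff):
        yr = y
        for _m in range(m_react):
            yr = yr + ds * yr * yr    # forward Euler on y' = y^2
        y = yr / (1.0 + lam * dt)     # backward Euler on y' = -lam*y
        nodal.append(y)
    return nodal


def max_nodal_error(n_diff, m_react):
    """Largest error at the diffusion nodes."""
    dt = T_FINAL / n_diff
    nodal = split_solve(n_diff, m_react)
    return max(abs(yn - exact(n * dt)) for n, yn in enumerate(nodal))


# Same reaction resolution (N * M = 50) with increasingly large diffusion steps.
errors = {(n, m): max_nodal_error(n, m) for n, m in [(50, 1), (10, 5), (5, 10)]}
for (n, m), err in errors.items():
    print(f"N={n:2d} M={m:2d} max nodal error = {err:.4f}")
```

With the reaction resolution held fixed, the nodal error grows as the diffusion step is enlarged, which is the loss of accuracy attributed above to the decoupling of the two components.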

We emphasize that the error in this example is a consequence of a kind of
instability introduced by multiscale operator decomposition. We will see below
that multiscale operator decomposition commonly affects both accuracy and
stability in a wide variety of problems.

1.2  The  key  is  stability.  But  what  is  stability  ...  and  stability  of 
what? 

Stability is likely one of the most shopworn terms in mathematics; it has been
given so many meanings that there is a high probability of miscommunication in
any mixed crowd. Nonetheless, stability is the key to quantifying the effects of
error and uncertainty on the output of a computed model solution.

Generally, it is impossible to give a definitive definition of stability in the
context of a multiphysics model. A computational mathematician might think
in terms of numerical stability, with instability characterized by oscillations on
the scale of the discretization. A mathematician in partial differential equations
might be thinking in terms of well-posedness, i.e. continuous dependence on
data. A physicist might be worried about preserving the conservation of important
quantities like mass and energy. A dynamicist might think in terms of the
stability properties of stationary solutions and attracting manifolds.

Indeed, all of these views of stability, and likely others, are important in the
right contexts. In fact, the only definitive thing to be said about stability is
that it is very unlikely that just one view of stability will suffice when solving a
multiphysics, multiscale problem.

1.2.1  Pointwise  stability  of  the  Lorenz  problem 

To illustrate the complexity of stability, we consider the infamous Lorenz problem,

$$
\begin{cases}
\dot{u}_1 = -10u_1 + 10u_2, \\
\dot{u}_2 = 28u_1 - u_2 - u_1u_3, \\
\dot{u}_3 = -\tfrac{8}{3}u_3 + u_1u_2,
\end{cases}
\qquad 0 < t. \tag{1.13}
$$

The Lorenz equations were derived by Lorenz (Lorenz, 1963) as a gross simplification
of a weather model to explain why weather predictions become inaccurate
after a few days. We have chosen parameter values believed to lead to chaotic
behavior. In Fig. 1.6 we plot a solution. All solutions rapidly approach the “strange
attractor” where they subsequently remain. The dynamical behavior is always
the same in qualitative terms. There are two non-zero steady state solutions and
a generic solution is either “orbiting” one of these solutions or transitioning
between orbits. The orbits spiral away from the steady-state solution at the center
until a point when a solution is sufficiently far away from the fixed point,
whereupon it moves to orbit around the other fixed point. In a crude way, solutions
behave in a very predictable fashion.

Fig. 1.6. Solution of the Lorenz problem (1.13) corresponding to initial
condition (-9.408, -9.096, 28.581).

Chaos is often described as “sensitivity to initial conditions”, which means
that solutions that begin close to each other eventually move apart. In Fig. 1.7,
we plot a second solution to the Lorenz problem that begins near the solution
plotted in Fig. 1.6 along with the distance between the solutions. The two solutions
start close together and actually remain close until around time 17.5, when
there is a rapid increase in their separation. For a brief period between 18 and 21,
the separation remains fairly constant, and then it begins to increase again, with
another very rapid increase around 24. All solutions must remain in a compact
region around the origin, so at some point the distance between the solutions
reaches the order of size of the compact region and cannot grow further.

Fig. 1.7. A second solution of the Lorenz problem (1.13) corresponding to initial
condition (-9.408, -9.096001, 28.581) and the pointwise difference between
the two solutions.
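
The separation experiment described above can be sketched in a few lines. This is only an illustration, not the computation behind the figures: it assumes a classical fourth-order Runge-Kutta integrator with fixed step dt = 0.005 (both are choices made here), the standard Lorenz coefficients 10, 28 and 8/3, and hypothetical function names.

```python
import math

def lorenz(u):
    """Right-hand side of the Lorenz system (standard coefficients assumed)."""
    u1, u2, u3 = u
    return (-10.0 * u1 + 10.0 * u2,
            28.0 * u1 - u2 - u1 * u3,
            -8.0 / 3.0 * u3 + u1 * u2)

def rk4_step(f, u, dt):
    """One classical fourth-order Runge-Kutta step (integrator chosen here)."""
    k1 = f(u)
    k2 = f(tuple(ui + 0.5 * dt * ki for ui, ki in zip(u, k1)))
    k3 = f(tuple(ui + 0.5 * dt * ki for ui, ki in zip(u, k2)))
    k4 = f(tuple(ui + dt * ki for ui, ki in zip(u, k3)))
    return tuple(ui + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for ui, a, b, c, d in zip(u, k1, k2, k3, k4))

def separation_history(u0, v0, dt=0.005, t_end=30.0):
    """Integrate two solutions side by side; return [(t, distance), ...]."""
    u, v = u0, v0
    history = [(0.0, math.dist(u, v))]
    for n in range(1, int(round(t_end / dt)) + 1):
        u = rk4_step(lorenz, u, dt)
        v = rk4_step(lorenz, v, dt)
        history.append((n * dt, math.dist(u, v)))
    return history

# Initial conditions quoted in the captions of Fig. 1.6 and Fig. 1.7.
history = separation_history((-9.408, -9.096, 28.581),
                             (-9.408, -9.096001, 28.581))
for t, d in history[::1000]:
    print(f"t = {t:5.1f}   distance = {d:.3e}")
```

The printed distances start at the size of the initial perturbation and eventually reach the size of the attractor; the precise times at which the jumps occur depend on the integrator and the step size.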

We conclude that two solutions that are pointwise close at some time eventually
diverge pointwise at a later time. The chaotic nature of the Lorenz problem
means that it is difficult to predict the pattern of orbits around the fixed points
with any accuracy very far into the future. On the other hand, nearby solutions
may remain close for quite some time and may even become closer before
eventually diverging.

The source of chaotic behavior in the Lorenz problem is actually rather complex
(Estep and Johnson, 1998). However, one important factor is relatively
easy to explain. Following (Estep and Johnson, 1998), in Fig. 1.8, we show a plot
looking straight down the vertical axis at parts of many solutions. The solutions
shown in the lower left corner are orbiting around one of the nonzero fixed points
or, if they are in the “outer” orbit, moving to the neighborhood of the other fixed
point. Likewise, the solutions plotted in the upper right corner are either orbiting
around a fixed point or moving to a neighborhood of the other fixed point.

We note that there are two solutions in the lower left-hand region that are
very close until they approach the vertical $u_3$ axis but then rapidly move apart
after that. In fact, there is a separatrix, or manifold, coming out of the $u_3$ axis
that separates solutions taking another orbit around a fixed point from those
that transition to the other fixed point. Solutions on either side of this manifold
move apart rapidly. Thus, we see how small perturbations can lead to rapid
separation. Any solution that passes through the neighborhood of the separatrix,
shaded in the figure, becomes very sensitive to small perturbations during the
short time the solution remains in the neighborhood. Eventually, all solutions
pass near the separatrix and thus become sensitive to perturbation. Away from
the neighborhood of the separatrix, the distance between solutions may grow or
shrink slowly, e.g. at a polynomial rate. This explains the pattern of separation
seen in the plot of distance between two solutions in Fig. 1.7.

Fig. 1.8. Left: Looking straight down the $u_3$ (vertical) axis at many solutions
of the Lorenz equations (1.13). Solutions passing through a neighborhood of
a separatrix coming out of the $u_3$ axis, shaded in the figure, are very sensitive
to small perturbations while they are in that neighborhood. Right: A plot of
the separatrix.

Not surprisingly, chaotic behavior affects numerical solutions as well. In
Fig. 1.9, we show the effects of varying step sizes on pointwise accuracy. We
plot the difference between the numerical solutions on the left in Fig. 1.10. The
pointwise numerical error clearly follows an increasing trend, but it does not
increase monotonically. In fact, the pointwise error actually decreases during
some short periods of time, see the plot on the right in Fig. 1.10.

Fig. 1.9. Two numerical approximations of the Lorenz solution shown in Fig. 1.6;
the solution on the left is accurate while the solution on the right
is computed with larger step sizes. The distance between the solutions is
shown on the right.

Fig. 1.10. Left: Plot of the pointwise difference between the numerical solutions
of Lorenz shown in Fig. 1.9. Right: A blowup of the difference for 0 < t < 12,
with annotations marking where errors decrease and where errors increase.

1.2.2  Classic  a  priori  stability  analysis 

But what about the classic a priori stability analysis that is taught in courses in
differential equations and numerical analysis (a priori means that it is carried out
before any solutions are computed)? Do these classic notions of stability present
a reasonable picture for a particular solution? In fact, the answer is decidedly
no in most cases. The classic view of stability for linear problems tends to be
black and white; a particular solution is either stable or unstable with respect
to perturbations. We see that this point of view fails for solutions of simple
nonlinear problems, e.g. the Lorenz problem.

Example 1.6 It is easy to illustrate the shortcomings of a priori stability analysis
using error analysis of the numerical solution of a matrix system of equations.
Consider a numerical solution $Y \approx y$ of a matrix system

$$ Ay = b, \tag{1.14} $$

computed using Gaussian elimination. The computable residual of $Y$ is $R =
AY - b$ and the classic relative error bound (Higham, 2002) is

$$ \frac{\|y - Y\|}{\|y\|} \le \kappa(A)\,\frac{\|R\|}{\|b\|}, \tag{1.15} $$

where the condition number $\kappa(A) = \|A\|\,\|A^{-1}\|$ is a measure of the sensitivity
of the solution of (1.14) to perturbations in the data, i.e. the stability properties
of the inverse operator of A. In particular,

$$ \kappa(A) = \frac{\|A\|}{\text{distance from } A \text{ to \{singular matrices\}}}. $$

(To be precise, we have to specify norms, but that level of detail is not important
here.)

We now solve (1.14) using Gaussian elimination, where A is a random 800 x
800 matrix whose coefficients are uniformly distributed, U(-1, 1), on (-1, 1).
The goal is to determine the first component $y_1$ of the
solution. The condition number of A is 6.7 x 10'*. Straightforward computation
yields

actual error in the quantity of interest ≈ 1.0 x 10
traditional error bound for the error ≈ 3.5 x 10”^

We see that the traditional error bound is orders of magnitude larger than the
actual error and is essentially useless as far as estimating the error in the particular
computed information.
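
A scaled-down version of this experiment can be sketched as follows. It is not the computation reported above. To keep the exact solution known and the example free of external libraries, it assumes a 30 x 30 matrix with random integer entries in [-9, 9] instead of an 800 x 800 matrix with U(-1, 1) entries, uses the infinity norm, and evaluates the residual exactly in rational arithmetic. All names are hypothetical.

```python
import random
from fractions import Fraction

def solve(A, b):
    """Gaussian elimination with partial pivoting in floating point."""
    n = len(A)
    M = [[float(a) for a in row] + [float(bi)] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))  # pivot row
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            m = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= m * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def norm_inf_vec(v):
    return max(abs(vi) for vi in v)

def norm_inf_mat(A):
    return max(sum(abs(a) for a in row) for row in A)

random.seed(0)
n = 30  # assumption: much smaller than the 800 x 800 example in the text
A = [[random.randint(-9, 9) for _ in range(n)] for _ in range(n)]
y = [random.randint(-9, 9) for _ in range(n)]            # known exact solution
b = [sum(a * yj for a, yj in zip(row, y)) for row in A]  # exact integer data

Y = solve(A, b)  # computed solution

# Residual R = A Y - b, evaluated exactly with rational arithmetic.
R = [sum(a * Fraction(Yj) for a, Yj in zip(row, Y)) - bi
     for row, bi in zip(A, b)]

# Condition number from an explicitly computed inverse (fine at this size).
cols = [solve(A, [1 if i == j else 0 for i in range(n)]) for j in range(n)]
A_inv = [[cols[j][i] for j in range(n)] for i in range(n)]
kappa = norm_inf_mat(A) * norm_inf_mat(A_inv)

rel_err = norm_inf_vec([Yi - yi for Yi, yi in zip(Y, y)]) / norm_inf_vec(y)
err_first = abs(Y[0] - y[0])  # error in the quantity of interest y_1
bound = kappa * float(norm_inf_vec(R)) / norm_inf_vec(b)

print(f"condition number       : {kappa:.3e}")
print(f"error in y_1           : {err_first:.3e}")
print(f"relative error         : {rel_err:.3e}")
print(f"bound from (1.15)      : {bound:.3e}")
```

Comparing the last two printed lines shows how much of the bound is slack for this particular matrix and right-hand side; the size of the gap varies with the random seed.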

We remark that the bound (1.15) is a specific example of a general
“meta-theorem”, which reads

Theorem 1.7

$$ \|\text{effect of perturbation on the output of an operator}\| \le \|\text{measure of stability of the operator}\| \times \|\text{size of the perturbation}\|. \tag{1.16} $$

In the linear algebra example, we have

$$ AY = b + R, $$

hence we can think of the numerical solution Y as solving the linear system
(1.14) with perturbed data b + R.

The  pessimism  of  a  classic  a  priori  stability  bound  is  not  surprising  given  a 
little  reflection.  After  all,  the  goal  of  such  a  bound  is  to  account  for  the  largest 
possible  error  in  a  large  class  of  solutions  corresponding  to  a  large  set  of  data,  not 
produce  an  accurate  error  estimate  for  particular  information  computed  from  a 
particular  solution.  The  power  of  an  a  priori  error  bound  is  that  it  characterizes 
the  general  behavior  of  the  numerical  method. 

The situation for nonlinear problems is worse. For example, in nonlinear
evolution problems, the classic stability analysis uses a Gronwall argument to
obtain a bound in the form,

$$ \text{effect of perturbation at time } t \le C e^{Lt} \times \text{size of perturbation}, $$

where C and L are constants with L typically large (L is on the order of 100 in
the Lorenz example). Such bounds are non-descriptive past a very short initial
transient, e.g. even for the chaotic Lorenz problem. For instance, with L = 100
the factor $e^{Lt}$ is already about $2.7 \times 10^{43}$ at t = 1. The factor $Ce^{Lt}$ plays the
role of a condition number (for an absolute error estimate).

1.2.3 Stability for stationary problems

There is a long tradition of conducting careful, precise analysis of stability for
evolutionary problems and, in particular, distinguishing different types of stability
properties. It is perhaps fair to say that stability for stationary problems
tends to be treated more crudely, at least for elliptic problems. This is unfortunate
because stability is just as complex an issue for stationary problems as
evolutionary problems.

Example 1.8 To illustrate this claim, we consider an elliptic problem

$$
\begin{cases}
L(u) = -\nabla\cdot\big((.001 + |\tanh(10(y+1))|)\nabla u\big) - a\,X\cdot\nabla u = f(x,y) = 10, & (x,y)\in\Omega, \\
u(x,y) = 0, & (x,y)\in\partial\Omega,
\end{cases}
\tag{1.17}
$$

where a = 0, 1 and $\Omega = [-2,2] \times [-2,2]$. The diffusion parameter is O(1) except
for a narrow region around the line y = -1, where it dips rapidly to .001. When
a = 0 there is no convection and when a = 1, there is strong convection. We
plot a solution with no convection in Fig. 1.11.


Fig. 1.11. Left: A plot of the solution of (1.17) with a = 0. Right: A plot of
the effect of the perturbation p when a = 0.

We now consider the effect of perturbing the data f to f + p by a very
“pointed” function p, which is nearly zero outside of a small neighborhood of
(-1, .5). Because the problem (1.17) is linear, we can compute the effects
directly. Namely, with L(u) = f and U denoting the perturbed solution,
L(U) = f + p, we can compute the perturbation to the solution, w = U - u,
directly as the solution of

$$ L(w) = L(U - u) = L(U) - L(u) = p. $$

In Fig. 1.11, we plot the perturbation w to the solutions for a = 0 and a = 1.

We observe that the effect of the perturbation decreases dramatically to zero
sufficiently far from the region where the perturbation is nonzero, e.g. close to the
boundaries x = 2 and y = -2, see Fig. 1.11. This kind of “decay of influence”
is characteristic of the Green's function associated with Poisson's problem. The
situation for more general elliptic problems is much more complex however.

For example, convection has a strong effect on the way in which the perturbation
disturbs the solution. In Fig. 1.12, we plot the solution with a = 1. Now,
some of the effects of the perturbation are felt right across the domain, even very
near the boundary at y = -2. Note also that the perturbation does not decay
uniformly in all directions.


Fig. 1.12. Left: A plot of the solution of (1.17) with a = 1. Right: A plot of
the effect of the perturbation p when a = 1. Note how the influence “curves”
because of the singular perturbation in the diffusion.


In general, the “decay of influence” of the effects of a localized perturbation is an
important stability property of elliptic problems. It has strong consequences for
devising efficient adaptive mesh refinement, for example. It can also be exploited
to devise new approaches to domain decomposition, see (Estep et al., 2005).
However, the decay is very problem dependent and exploiting it fully requires
numerical solution of the adjoint problem in general.

We can contrast these ideas with the classic analysis of elliptic stability, which
typically yields a result of the form

$$ \|w\|_{*} \le C\,\|p\|_{**}, $$

for some appropriate norms $\|\cdot\|_{*}$, $\|\cdot\|_{**}$, where p belongs in some reasonable space
of functions and C is some constant independent of the choice of particular p in
this space. We see that such a result does not describe the decay of influence in
the example above. As with the linear algebra example, an a priori analysis of
stability tends to be much too pessimistic when applied to a particular solution.

1.2.4 The meaning of stability depends on the information to be computed

In  the  examples  above,  we  concentrated  on  the  stability  of  pointwise  values.  But, 
we  must  broaden  the  point  of  view  here.  For  example,  a  little  reflection  suggests 
that  worrying  about  the  pointwise  behavior  of  Lorenz  solutions  is  not  very  well 
motivated  in  terms  of  physical  modeling.  In  whatever  sense  the  Lorenz  problem 
presents  a  model  of  weather,  it  is  certainly  not  a  pointwise  representation  of 
weather!  Rather,  it  is  the  qualitative  behavior  of  the  Lorenz  problem  that  is 
meant  to  represent  some  characteristic  of  weather  patterns.  It  is  more  reasonable 
to  consider  a  quantity  of  interest  computed  from  solutions  that  better  represents 
qualitative  behavior  of  all  solutions. 

This is an important observation because it turns out that the effect of
perturbations on a solution depends strongly on the information being computed
from the solution.

Example  1.9  To  illustrate  this,  we  consider  the  average  of  the  instantaneous 
distance  from  a  solution  of  the  Lorenz  problem  to  the  origin,  see  Fig.  1.13.  The 
motivation  is  that  all  solutions  must  remain  in  a  neighborhood  of  the  origin.  We 
compare  results  for  numerical  solutions  with  a  coarse  time  step  .001  and  fine  time 
step  .0001,  and  see  that  the  distances  are  completely  different  after  a  moderate 
time. 


Fig. 1.13. The quantity of interest is the average of the instantaneous distance
from the solution to the origin. On the right, we plot the average instantaneous
distance against time for solutions with time steps .001 and .0001 respectively.
The distances agree during an initial period and are completely different after
that.

We  give  values  for  the  average  instantaneous  distance  along  with  the  variance 
over  three  intervals  in  Table  1.1.  The  accuracy  of  the  numerical  solution  appears 
to  have  little  effect  on  the  average  distance. 
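
The insensitivity of this averaged quantity to the step size can be checked with a short sketch. This is an illustration under stated assumptions, not the computation behind Table 1.1: it uses a classical fourth-order Runge-Kutta integrator, step sizes 0.02 and 0.005 rather than .001 and .0001, an end time of 150 chosen to keep the run short, and the standard Lorenz coefficients. The function names are hypothetical.

```python
import math

def lorenz(u):
    """Right-hand side of the Lorenz system (standard coefficients assumed)."""
    u1, u2, u3 = u
    return (-10.0 * u1 + 10.0 * u2,
            28.0 * u1 - u2 - u1 * u3,
            -8.0 / 3.0 * u3 + u1 * u2)

def rk4_step(f, u, dt):
    """One classical fourth-order Runge-Kutta step (integrator chosen here)."""
    k1 = f(u)
    k2 = f(tuple(ui + 0.5 * dt * ki for ui, ki in zip(u, k1)))
    k3 = f(tuple(ui + 0.5 * dt * ki for ui, ki in zip(u, k2)))
    k4 = f(tuple(ui + dt * ki for ui, ki in zip(u, k3)))
    return tuple(ui + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for ui, a, b, c, d in zip(u, k1, k2, k3, k4))

def distance_stats(u0, dt, t_end):
    """Time average and variance of the distance from u(t) to the origin."""
    u = u0
    n_steps = int(round(t_end / dt))
    total = total_sq = 0.0
    for _ in range(n_steps):
        u = rk4_step(lorenz, u, dt)
        r = math.sqrt(u[0] ** 2 + u[1] ** 2 + u[2] ** 2)
        total += r
        total_sq += r * r
    mean = total / n_steps
    return mean, total_sq / n_steps - mean * mean

u0 = (-9.408, -9.096, 28.581)  # initial condition quoted for Fig. 1.6
coarse_ave, coarse_var = distance_stats(u0, dt=0.02, t_end=150.0)
fine_ave, fine_var = distance_stats(u0, dt=0.005, t_end=150.0)

print(f"coarse: ave = {coarse_ave:.2f}, var = {coarse_var:.2f}")
print(f"fine  : ave = {fine_ave:.2f}, var = {fine_var:.2f}")
```

The two trajectories differ completely pointwise over this time span, yet the two averages should land close to each other; how close depends on the averaging time.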

              Coarse Solution     Fine Solution
End Time      Ave      Var        Ave      Var
              27.6                27.6     51.9
              26.5                26.5     79.2
320           26.3                26.3     83.0

Table 1.1. Average instantaneous distance of numerical Lorenz solutions to the
origin computed with time steps .001 and .0001.

In order to verify this observation, we compare these results to the average
distance to the origin computed from an ensemble of 100 accurate solutions, each
computed using time step .0001 for 15 time units, in Table 1.2. Initial values for
these solutions were drawn at random from values of the long time solution after
T = 50, ensuring the initial values are distributed appropriately on the strange
attractor. Again, there is close agreement in values.

              Coarse Solution     Fine Solution     Ensemble Average
End Time      Ave      Var        Ave      Var      Ave      Var
320           26.3     83.7       26.3     83.0

Table 1.2. Average distance of numerical Lorenz solutions to the origin
computed over a long time and using an ensemble average.


Example 1.10 A characteristic of Lorenz solutions that is linked to chaotic
behavior is the pattern of orbits around the nonzero fixed points. We compute
the average number of orbits around a particular fixed point made by a solution
before it moves to orbit the other fixed point, see Fig. 1.14. We compare
the probability density of orbits for an ensemble average over many short time,
accurate solutions to that for a long time solution.

Fig. 1.14. The quantity of interest is the average number of orbits around a
nonzero fixed point. On the right, we plot the probability density for the
orbits (horizontal axis: number of orbits, 1 to 10) computed from a long time
solution and an ensemble average of short time accurate solutions.

The conclusion is that solutions of the chaotic Lorenz problem are sensitive
pointwise to perturbations and errors, yet there are quantities of interest that can
be computed from the solutions that are relatively insensitive to perturbation
and error. An analysis to obtain accurate estimates of the effects of perturbation
and error must be conducted relative to the information that is to be computed
from solutions.

1.3  The  tools  for  quantifying  stability  properties:  Functionals, 
duality,  and  adjoint  operators 

In the previous section, we saw that stability plays a critical role in determining
the effects of perturbation and error. We saw that stability is often a complex
issue with many facets that are not easily determined. We also saw that classic
a priori analyses may be too crude to be used to quantify variation and error
arising in particular information computed from particular solutions of particular
problems with any reasonable accuracy.

All of these observations provide motivation to find another approach to
determine the effects of stability. The approach we describe in this chapter is a
posteriori, which means that the stability of particular information of a particular
solution is determined after the information is computed. A posteriori and a
priori analyses are fundamentally different. For example, an a priori error analysis
of a numerical method describes the general accuracy properties for a wide
class of solutions, yet generally overestimates the error in any particular solution
to a significant extent. An a posteriori error analysis provides an estimate of the
error in particular computed information, but the estimate changes when the
solution changes and consequently it is generally difficult to draw any conclusions
about convergence of the method. Both a priori and a posteriori analyses play
key roles in analyzing the effects of uncertainty and error.

We use duality and adjoint operators to quantify stability a posteriori. We
combine these with variational analysis to produce accurate estimates of the
effects of perturbation and error. These tools have a long history of application
in model sensitivity analysis and optimization, dating back to Lagrange.
We present a very brief overview here, see the references (Marchuk et al., 1996;
Lanczos, 1997; Cacuci, 1997; Marchuk, 1995; Atkinson and Han, 2001; Aubin,
2000; Cheney, 2000; Folland, 1999; Schechter, 2002) for more details. The
application of these tools to a posteriori error estimation has a more recent history, see
the references in (Eriksson et al., 1996; Eriksson et al., 1995; Estep et al., 2000;
Becker and Rannacher, 2001; Giles and Süli, 2002; Bangerth and Rannacher,
2003; Paraschivoiu et al., 1997; Barth, 2004).

1.3.1 Functionals and computing information

We focus on computing a particular piece of information, or a quantity of interest,
from a solution of a model. We use linear functionals, which are a special kind
of linear map, to do this. A continuous linear functional $\ell$ is a continuous linear
map from a vector space X to the reals $\mathbb{R}$.

Example 1.11 Let v in $\mathbb{R}^n$ be fixed. The map $\ell(x) = v \cdot x = (x, v)$ is a linear
functional on $\mathbb{R}^n$.

Example 1.12 Consider $C([a,b])$. Both $\ell(f) = \int_a^b f(x)\,dx$ and $\ell(f) = f(y)$ for
$a \le y \le b$ are linear functionals.

Example 1.13 There are important nonlinear functionals, e.g. norms.

It is useful to think of a linear functional as providing a “one dimensional
snapshot” of a vector.
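
The functionals in Examples 1.11 to 1.13 can be written down directly. The sketch below is illustrative only: the helper names are hypothetical, and the integral functional is approximated by the trapezoid rule, which is a choice made here. It checks linearity numerically and shows that the norm fails the same check.

```python
import math

def dot_functional(v):
    """l(x) = (x, v) on R^n, as in Example 1.11."""
    return lambda x: sum(xi * vi for xi, vi in zip(x, v))

def point_evaluation(y):
    """l(f) = f(y), as in Example 1.12."""
    return lambda f: f(y)

def integral_functional(a, b, n=1000):
    """l(f) = integral of f over [a, b], approximated by the trapezoid rule."""
    h = (b - a) / n
    def ell(f):
        inner = sum(f(a + i * h) for i in range(1, n))
        return h * (0.5 * f(a) + inner + 0.5 * f(b))
    return ell

f, g = math.sin, math.cos
combo = lambda x: 2.0 * f(x) - 3.0 * g(x)  # the function 2f - 3g

ell_point = point_evaluation(0.3)
ell_int = integral_functional(0.0, math.pi)

# Linearity: l(2f - 3g) should equal 2 l(f) - 3 l(g) up to rounding.
gap_point = abs(ell_point(combo) - (2.0 * ell_point(f) - 3.0 * ell_point(g)))
gap_int = abs(ell_int(combo) - (2.0 * ell_int(f) - 3.0 * ell_int(g)))

ell_dot = dot_functional([1.0, 2.0, 3.0])
x, z = [1.0, 0.0, -1.0], [0.5, 0.5, 0.5]
gap_dot = abs(ell_dot([xi + zi for xi, zi in zip(x, z)])
              - (ell_dot(x) + ell_dot(z)))

# A norm is a functional but not a linear one: ||x + (-x)|| != ||x|| + ||-x||.
norm = lambda w: math.sqrt(sum(wi * wi for wi in w))
norm_gap = norm(x) + norm([-xi for xi in x]) - norm([0.0, 0.0, 0.0])

print(gap_point, gap_int, gap_dot, norm_gap)
```

Each linear functional reduces a vector or function to a single number, which is the sense in which it acts as a snapshot.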

Example 1.14 In Example 1.11, consider $v = e_i$, the i-th standard basis
vector. Then $\ell(x) = x_i$ where $x = (x_1, \cdots, x_n)$.

Example 1.15 Let $\delta_y$ denote the delta function at a point y in a region $\Omega$. This
gives a linear functional on sufficiently smooth, real valued functions via

$$ \ell(u) = u(y) = \int_\Omega \delta_y(x)\,u(x)\,dx. $$

Example  1.16  The  expected  value  E(X)  of  a  random  variable  X  is  a  linear 
functional. 

Example 1.17 The Fourier coefficients of a continuous function f on $[0, 2\pi]$,

$$ c_j = \int_0^{2\pi} f(x)\,e^{-ijx}\,dx, $$

are linear functionals of f.

Using linear functionals means settling for a set of snapshots rather than
the entire solution. Presumably, it is easier to compute accurate snapshots than
solutions that are accurate everywhere. In many situations, we settle for an
“incomplete” set of samples.

Example 1.18 We are often happy with a small set of moments of a random
variable.

Example 1.19 In applications of Fourier series, we typically use a finite sum
truncation of the infinite series. We require increasing amounts of information,
e.g. values, of a function in order to compute increasingly higher order Fourier
coefficients.

We define the dual space to be the collection of “reasonable” snapshots. More
precisely, if X is a normed vector space with norm $\|\cdot\|_X$, the space of continuous
linear functionals on X is called the dual space of (or on, or to) X, and is denoted
by $X^*$. The dual space is a vector space. We can define the dual norm for $y \in X^*$
by

$$ \|y\|_{X^*} = \sup_{x \in X,\ \|x\|_X = 1} |y(x)| = \sup_{x \in X,\ x \neq 0} \frac{|y(x)|}{\|x\|_X}. $$

Example 1.20 When $X = \mathbb{R}^n$ with the usual dot product $(\cdot,\cdot)$ and norm $\|\cdot\| =
\|\cdot\|_2$, we saw that every vector v in $\mathbb{R}^n$ is associated with a linear functional
$\ell_v(\cdot) = (\cdot, v)$. This functional is continuous since $|(x,v)| \le \|v\|\,\|x\|$ (the “C” in
the definition is $\|v\|$). A classic result in linear algebra is that all linear functionals
on $\mathbb{R}^n$ have this form, i.e., we can make the identification $(\mathbb{R}^n)^* \cong \mathbb{R}^n$.

Example 1.21 Recall that Hölder's inequality for $f \in L^p(\Omega)$ and $g \in L^q(\Omega)$ with
$\frac{1}{p} + \frac{1}{q} = 1$ for $1 \le p, q \le \infty$ is

$$ \|fg\|_{L^1(\Omega)} \le \|f\|_{L^p(\Omega)}\,\|g\|_{L^q(\Omega)}. $$

This implies that each g in $L^q(\Omega)$ is associated with a bounded linear functional
on $L^p(\Omega)$ when $\frac{1}{p} + \frac{1}{q} = 1$ and $1 \le p, q \le \infty$ by

$$ \ell(f) = \int_\Omega g(x)\,f(x)\,dx. $$

An important, and difficult, result is that we can “identify” $(L^p)^*$ with $L^q$ when
$1 < p, q < \infty$. The cases $p = 1, q = \infty$ and $p = \infty, q = 1$ are trickier. The case
$L^2$ is special in that we can identify $(L^2)^*$ with $L^2$.

There is a useful notation for the value of a functional. If x is in X and y is
in $X^*$, we define the bracket of x and y as

$$ y(x) = \langle x, y \rangle. $$

It is not surprising that norms on X and its dual $X^*$ are closely related. An
important inequality is

Theorem 1.22 The generalized Cauchy inequality is

$$ |\langle x, y \rangle| \le \|x\|_X\,\|y\|_{X^*}, \quad x \in X,\ y \in X^*. $$

Combining this with the Hahn-Banach theorem yields a “weak” representation
of the norm on X,

Theorem 1.23 If X is a Banach space, then

$$ \|x\|_X = \sup_{y \in X^*,\ y \neq 0} \frac{|y(x)|}{\|y\|_{X^*}} = \sup_{y \in X^*,\ \|y\|_{X^*} = 1} |y(x)| $$

for all x in X.

This says that we can determine the size of a vector in X by examining a sufficient
number of “snapshots”.
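
In $\mathbb{R}^n$ with the Euclidean norm, where functionals are dot products with vectors, this representation is easy to check numerically. The following sketch is only an illustration: the vector x, the number of random snapshots, and the helper names are all choices made here.

```python
import math
import random

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

random.seed(1)
x = [3.0, -4.0, 12.0]  # Euclidean norm is 13

# Snapshots |(x, y)| taken with random unit vectors y never exceed ||x||.
snapshots = []
for _ in range(2000):
    y = [random.gauss(0.0, 1.0) for _ in x]
    ny = norm(y)
    y = [yi / ny for yi in y]
    snapshots.append(abs(dot(x, y)))
best_random = max(snapshots)

# The supremum is attained by the unit vector pointing along x.
y_opt = [xi / norm(x) for xi in x]
attained = abs(dot(x, y_opt))

print(f"||x||                    = {norm(x):.6f}")
print(f"largest random snapshot  = {best_random:.6f}")
print(f"snapshot along x/||x||   = {attained:.6f}")
```

No single random snapshot recovers the norm, but the largest of many comes close, and the optimal direction reproduces it.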

Example 1.24 In Ex. 1.20, we saw that $\mathbb{R}^n$ with the standard Euclidean norm
can be identified with its dual space. Likewise, $L^2$ can be identified with its dual
space. Both of these spaces are Hilbert spaces.

Remarkably, Ex. 1.20 generalizes to infinite dimensions. If X is a Hilbert
space with inner product $(x,y)$, then each $y \in X$ determines a linear functional
$\ell_y(x) = \langle x, y \rangle = (x,y)$ for x in X. This functional is continuous by Cauchy's
inequality, which says that $|(x,y)| \le \|x\|\,\|y\|$.

It turns out that this is the only kind of continuous linear functional on a Hilbert
space.

Theorem 1.25 (Riesz Representation for Hilbert Spaces) For every bounded
linear functional $\ell$ on a Hilbert space X, there is a unique element y in X such
that

$$ \ell(x) = (x,y) \ \text{for all } x \in X, \quad \text{and} \quad \|y\|_{X^*} = \sup_{x \in X,\ x \neq 0} \frac{|\ell(x)|}{\|x\|}. $$

This means that the dual space to a Hilbert space X can be identified with X.
Abusing notation, it is common to replace the bracket notation and the
generalized Cauchy inequality by the inner product and the “real” Cauchy inequality
without comment.


1.3.2  The  adjoint  operator 

To motivate the definition of the adjoint operator, let X and Y be two Banach
spaces, let $L : X \to Y$ be a continuous linear map, and consider the problem of
computing a functional value

$$ \ell(L(x)) $$

for some input $x \in X$. Some important questions are

• Given that we only want a functional value of the solution, can we find a
way to compute the functional value efficiently?

• What is the error in the functional value if approximations are involved?

• Given a functional value, what can we say about x?

• Given a collection of functional values, what can we say about L?

We can address these questions using the adjoint operator. Suppose L is a
continuous linear transformation. For each $y^* \in Y^*$,

$$ y^* \circ L(x) = y^*(L(x)) = \langle Lx, y^* \rangle $$

assigns a number to each $x \in X$, hence defines a functional $\ell(x)$. The functional
$\ell(x)$ is clearly linear. It is also continuous since

$$ |\ell(x)| = |y^*(L(x))| \le \|y^*\|_{Y^*}\,\|L(x)\|_Y \le \|y^*\|_{Y^*}\,\|L\|\,\|x\|_X = C\|x\|_X, $$

where $C = \|y^*\|_{Y^*}\,\|L\|$. By the definition of the dual space, there is an $x^* \in X^*$
such that $y^*(L(x)) = x^*(x)$ for all $x \in X$. This $x^*$ is unique.

We have defined $\ell = x^*$ implicitly. Given x, we first apply the operator L,
then compute a functional of the result. This may seem a little strange. We put
it into context of a classic example.

Example 1.26 Consider the elliptic problem

$$
\begin{cases}
-\Delta u = f, & y \in \Omega, \\
u = 0, & y \in \partial\Omega,
\end{cases}
\tag{1.18}
$$

where we wish to evaluate $u(y_0)$ for some $y_0 \in \Omega$. In this example, the data f
plays the role of x above in the definition of $\ell$. Note that we do not evaluate
$f(y_0)$. Instead, we have to solve (1.18), where $u = L(f)$ is determined by the
solution operator L of the Dirichlet problem. We then apply the functional

$$ u(y_0) = \langle u, \delta_{y_0} \rangle = \langle L(f), \delta_{y_0} \rangle. $$

We now apply this implicit definition to each $y^* \in Y^*$. For each $y^*$, assign a unique $x^* \in X^*$ and in this way define a linear transformation $L^* : Y^* \to X^*$, called the adjoint or dual operator to $L$.

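Example 1.27 Continuing Ex. 1.26, we pose the adjoint problem

\[
\begin{aligned}
-\Delta \phi &= \delta_{y_0}, && y \in \Omega, \\
\phi &= 0, && y \in \partial\Omega,
\end{aligned}
\]

and denote the solution $\phi = L^*(\delta_{y_0})$. We have

\[ \langle u, \delta_{y_0} \rangle = \langle L(f), \delta_{y_0} \rangle = \langle f, L^*(\delta_{y_0}) \rangle = \langle f, \phi \rangle. \]

This is just the method of Green's functions.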
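A discrete analogue of Ex. 1.26 and Ex. 1.27 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the one-dimensional problem $-u'' = f$ on $(0,1)$ with zero boundary values and a second-difference matrix `K`, neither of which appears in the text, and the grid size, the data `f`, and the evaluation point are arbitrary choices.

```python
import numpy as np

n = 99                                   # number of interior grid points (assumed)
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)

# Second-difference approximation of -d^2/dx^2 with zero boundary values.
K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

f = np.pi**2 * np.sin(np.pi * x)         # data chosen so that u = sin(pi x)
j = 29                                   # evaluation point y0 = x[j] = 0.3
y0 = x[j]

# Forward route: solve the problem, then apply the functional.
u = np.linalg.solve(K, f)
u_y0_forward = u[j]

# Adjoint route: one solve with a discrete delta function as data.
delta = np.zeros(n)
delta[j] = 1.0 / h
phi = np.linalg.solve(K.T, delta)
u_y0_adjoint = h * np.dot(f, phi)        # discrete version of (f, phi)

# Continuous Green's function of -u'' on (0, 1) with zero Dirichlet data.
green = np.where(x <= y0, x * (1.0 - y0), y0 * (1.0 - x))
```

Once `phi` is available, the value at `y0` for any other data `f` costs only one inner product, which is the practical content of the method of Green's functions.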

Note that we have defined the adjoint transformation via computing snapshots using elements in the dual space. We can write these relations as

\[ y^*(L(x)) = L^* y^*(x) \]

or using the bracket notation,

\[ \langle L(x), y^* \rangle = \langle x, L^*(y^*) \rangle, \quad x \in X, \ y^* \in Y^*. \tag{1.19} \]

Equation (1.19) is called the bilinear identity.

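Example 1.28 Let $X = \mathbb{R}^m$ and $Y = \mathbb{R}^n$, where we take the standard inner product and norm. By the Riesz Representation theorem, the bilinear identity for $L \in \mathcal{L}(\mathbb{R}^m, \mathbb{R}^n)$ reads

\[ (Lx, y) = (x, L^* y), \quad x \in \mathbb{R}^m, \ y \in \mathbb{R}^n. \]

We know that $L$ is represented by a unique $n \times m$ matrix $A$ so that if $y = L(x)$, then

\[ y_i = \sum_{j=1}^m a_{ij} x_j, \quad 1 \le i \le n. \]

For a linear functional $y^* = (y_1^*, \dots, y_n^*)^\top \in Y^*$, we have

\[ L^* y^*(x) = y^*(L(x)) = (y_1^*, \dots, y_n^*) \begin{pmatrix} \sum_{j=1}^m a_{1j} x_j \\ \vdots \\ \sum_{j=1}^m a_{nj} x_j \end{pmatrix} = \sum_{j=1}^m \Big( \sum_{i=1}^n y_i^* a_{ij} \Big) x_j. \]

Therefore, $L^*(y^*)$ is given by the inner product with $\tilde y = (\tilde y_1, \dots, \tilde y_m)^\top$, where

\[ \tilde y_j = \sum_{i=1}^n y_i^* a_{ij}. \]

This implies the matrix $A^*$ of $L^*$ is

\[ A^* = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{n1} \\ a_{12} & a_{22} & \cdots & a_{n2} \\ \vdots & \vdots & & \vdots \\ a_{1m} & a_{2m} & \cdots & a_{nm} \end{pmatrix} = A^\top. \]

We can write the bilinear identity as

\[ y^\top A x = x^\top A^\top y \]

using the fact that $(x, y) = (y, x)$.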

We give more examples below.

We conclude with some basic facts:

Theorem 1.29 Let $X$ and $Y$ be normed vector spaces and $L : X \to Y$ a continuous linear operator. Then, $L^*$ is a continuous linear operator and $\|L^*\| = \|L\|$. Also, $0^* = 0$. If $M : X \to Y$ is another continuous linear operator, then $(L + M)^* = L^* + M^*$ and $(\alpha L)^* = \alpha L^*$ for all scalars $\alpha$.

If $Z$ is another normed vector space and $N : Y \to Z$ is a continuous linear operator, then $NL : X \to Z$ is a continuous linear operator and $(NL)^* = L^* N^*$.
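For matrices, the bilinear identity and the facts in Theorem 1.29 are easy to check numerically. The following Python sketch does so for randomly generated matrices; the dimensions, the random seed, and the variable names are arbitrary choices made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, p = 4, 3, 5
A = rng.standard_normal((n, m))      # matrix of L : R^m -> R^n
N = rng.standard_normal((p, n))      # matrix of N : R^n -> R^p
x = rng.standard_normal(m)
y = rng.standard_normal(n)

# Bilinear identity (Lx, y) = (x, L* y) with L* represented by the transpose.
lhs = np.dot(A @ x, y)
rhs = np.dot(x, A.T @ y)

# ||L*|| = ||L|| in the operator 2-norm.
norm_A = np.linalg.norm(A, 2)
norm_At = np.linalg.norm(A.T, 2)

# (NL)* = L* N*.
product_rule_gap = np.max(np.abs((N @ A).T - A.T @ N.T))
```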

1.3.3  Four  good  reasons  to  use  adjoints 

We can provide some immediate motivations to introduce the concepts of duality and adjoint operators, which indeed turn out to be fundamentally important for the analysis of operators.

Reason #1 $X^*$ often has good properties that $X$ may lack.

Theorem 1.30 If $X$ is a normed vector space over $\mathbb{R}$, then $X^*$ is a Banach space, i.e. Cauchy sequences in $X^*$ converge to a limit in $X^*$, whether or not $X$ is a Banach space.

Reason #2 There is a close connection between the stability properties of an operator and its adjoint.

Theorem 1.31 The singular values of a matrix $L$ are the square roots of the eigenvalues of the square, symmetric transformations $L^*L$ or $LL^*$.
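As a quick numerical check of the connection between singular values and the adjoint, the Python sketch below compares the singular values of a random rectangular matrix with the square roots of the eigenvalues of $L^\top L$ and $L L^\top$ (the nonzero eigenvalues of the two products coincide). The matrix and its size are arbitrary choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
L = rng.standard_normal((5, 3))

sing = np.linalg.svd(L, compute_uv=False)        # singular values, descending
eig_LtL = np.linalg.eigvalsh(L.T @ L)[::-1]      # eigenvalues of L^T L, descending
eig_LLt = np.linalg.eigvalsh(L @ L.T)[::-1][:3]  # three largest eigenvalues of L L^T

# 2-norm condition number recovered from the eigenvalues of L^T L.
cond_from_adjoint = np.sqrt(eig_LtL[0] / eig_LtL[-1])
```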

This connects the condition number of a matrix $L$ to $L^*$.

Reason #3 If $L$ is a linear transformation between normed vector spaces, the solvability of $L(y) = b$ is closely related to the solvability of $L^*(\phi) = \psi$.

Theorem 1.32 Let $X$ and $Y$ be normed linear spaces and $L : X \to Y$ a continuous linear transformation. A necessary condition that $L(x) = y$ has a solution is that $y^*(y) = 0$ for all continuous functionals $y^*$ such that $L^* y^* = 0$. This is a sufficient condition if the range of $L$ is closed in $Y$.

Example 1.33 Suppose that $L : \mathbb{R}^m \to \mathbb{R}^n$ is associated with the $n \times m$ matrix $A$, i.e., $L(x) = Ax$. The necessary and sufficient condition for the solvability of $Ax = b$ is that $b$ is orthogonal to all linearly independent solutions of $A^\top y = 0$.

In general, all kinds of information about the solvability and deficiency of the linear system $Ax = b$ can be determined by considering $A^*A$. In the over-determined and under-determined cases, it yields a "natural" definition of a solution or gives conditions for a solution to exist, see (Lanczos, 1997).
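The solvability condition of Ex. 1.33 can be tested directly. The Python sketch below is an illustration with a small, hand-picked $3 \times 2$ matrix (an assumption, not taken from the text): it computes a basis of the null space of $A^\top$ from the singular value decomposition and checks orthogonality of the right-hand side against it.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [0.0, 1.0]])                     # n = 3, m = 2, rank 2

# Columns of U beyond the rank span the null space of A^T.
U, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-12))
null_At = U[:, rank:]

def is_solvable(b):
    """True if b is orthogonal to every solution of A^T y = 0."""
    return bool(np.allclose(null_At.T @ b, 0.0))

def residual_norm(b):
    """Norm of A x - b for the least squares solution x."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]
    return np.linalg.norm(A @ x - b)

b_good = A @ np.array([1.0, -1.0])             # in the range of A by construction
b_bad = b_good + null_At[:, 0]                 # has a component outside the range
```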

Reason #4 Suppose we wish to compute a functional $(\cdot, \psi)$ of the solution $y$ of an inverse problem for a linear operator $A$ and data $b$,

\[ Ay = b. \]

We define the adjoint (inverse) problem

\[ A^* \phi = \psi. \]

Then we obtain a representation of the solution,

\[ (y, \psi) = (y, A^* \phi) = (Ay, \phi) = (b, \phi). \]

Such a representation is very useful in practice. For example, we can compute the effect of many perturbations in the data very efficiently by computing one adjoint solution and taking inner products with the perturbations.

This formal argument is essentially the entire foundation for error estimation and uncertainty quantification described in this chapter. The reader may notice that it generalizes the method of Green's functions. We discuss this further below.
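The efficiency claim in Reason #4 can be sketched as follows. The Python example is an illustration under stated assumptions: the matrix, the functional $\psi$, and the perturbations are randomly generated, and a forward solve for every perturbation is included only as a reference for comparison.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 40
A = rng.standard_normal((n, n)) + n * np.eye(n)    # synthetic, well-conditioned matrix
psi = rng.standard_normal(n)

# One adjoint solve.
phi = np.linalg.solve(A.T, psi)

# Effect of 1000 data perturbations db on the functional: (db, phi).
perturbations = 1e-3 * rng.standard_normal((1000, n))
effect_adjoint = perturbations @ phi

# Reference: solve A dy = db for every perturbation, then evaluate (dy, psi).
effect_forward = np.linalg.solve(A, perturbations.T).T @ psi
```

The adjoint route needs a single linear solve regardless of the number of perturbations, while the reference computation needs one solve per perturbation.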

1.3.4  Adjoint  operators  for  linear  differential  equations 

We briefly discuss the computation of adjoints to differential equations. On a simple level, given a differential operator $L$ on a domain $\Omega$, we seek to evaluate the bilinear identity,

\[ \langle Lu, v^* \rangle - \langle u, L^* v^* \rangle = 0, \quad \text{all } u \in X, \ v^* \in Y^*. \tag{1.20} \]

But there are a lot of details needed to make this a computational process, e.g. what does it mean to compute $\langle \cdot, \cdot \rangle$?

In the common situation in which we consider functions in a Hilbert space like $L^2$, we attempt to replace $\langle \cdot, \cdot \rangle$ by the inner product $(\cdot, \cdot)$,

\[ \langle Lu, v^* \rangle - \langle u, L^* v^* \rangle = (Lu, v) - (u, L^* v). \]

Even using the familiar $L^2$ inner product, however, computing an adjoint can be tricky. On one hand, the process for computing an adjoint is simple to state: multiply the differential equation by a test function, integrate over the entire space-time domain (which amounts to taking the inner product of the differential equation and the test function), and keep integrating by parts until all derivatives fall on the test function. The differential operator that ends up being applied to the test function "is" the adjoint operator. On the other hand, details lead to all kinds of technical difficulties. A general abstract theory is difficult to present, see (Lions and Magenes, 1972).

First of all, the definition of the adjoint of a given forward operator depends heavily on the spaces involved with the maps. On the face of it, the process described above only works for functions sufficiently smooth that all the integrations by parts are defined. Technically, we compute the adjoint for smooth functions and then pass to a limit (a "density" argument) to the full spaces on which the operators are defined.

Second of all, integration by parts leaves behind integrals over boundaries and these have to be accounted for when defining the adjoint operator. The reason is simply that a differential operator is generally under-determined and we add boundary and initial conditions in order to get an invertible operator. Clearly, the boundary and initial conditions therefore must affect the definition of the adjoint operator.

To simplify life, we compute the adjoint in two stages. We first assume that the functions involved are smooth and have compact support inside $\Omega$, i.e. the functions and all their derivatives vanish at the boundary. In this way, we carry out the integration by parts while ignoring boundary terms. Given a differential operator $L$ on a domain $\Omega$, the formal adjoint $L^*$ is the differential operator that satisfies

\[ (Lu, v) = (u, L^* v) \]

for all sufficiently smooth $u$ and $v$ with compact support in $\Omega$.

Example  1.34  For 


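on [0, 1). Integration by parts neglecting boundary terms gives the formal adjoint

Example 1.35 A general linear second order differential operator $L$ in $\Omega \subset \mathbb{R}^n$ can be written

\[ L(u) = \sum_{i=1}^n \sum_{j=1}^n a_{ij} \frac{\partial^2 u}{\partial x_i \partial x_j} + \sum_{i=1}^n b_i \frac{\partial u}{\partial x_i} + c u, \]

where $\{a_{ij}\}$, $\{b_i\}$, and $c$ are functions of $x_1, x_2, \dots, x_n$. Then,

\[ L^*(v) = \sum_{i=1}^n \sum_{j=1}^n \frac{\partial^2 (a_{ij} v)}{\partial x_i \partial x_j} - \sum_{i=1}^n \frac{\partial (b_i v)}{\partial x_i} + c v. \]

It can be verified directly that

\[ v L(u) - u L^*(v) = \sum_{i=1}^n \frac{\partial p_i}{\partial x_i}, \qquad p_i = \sum_{j=1}^n \Big( a_{ij} v \frac{\partial u}{\partial x_j} - u \frac{\partial (a_{ij} v)}{\partial x_j} \Big) + b_i u v. \]

The expression on the right is a divergence expression and the divergence theorem yields

\[ \int_\Omega \big( v L(u) - u L^*(v) \big)\, dx = \int_{\partial\Omega} p \cdot n \, ds = 0, \]

where $p = (p_1, \dots, p_n)$ and $n$ is the outward normal in $\partial\Omega$.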

Example 1.36 Let $L$ be a differential operator of order $2p$ of the form

\[ Lu = \sum_{|\alpha|, |\beta| \le p} (-1)^{|\alpha|} D^\alpha \big( a_{\alpha\beta}(x) D^\beta u \big), \]

then

\[ L^* v = \sum_{|\alpha|, |\beta| \le p} (-1)^{|\alpha|} D^\alpha \big( a_{\beta\alpha}(x) D^\beta v \big), \]

and $L$ is elliptic if and only if $L^*$ is elliptic. Some special cases:

\[ \mathrm{grad}^* = -\mathrm{div}, \qquad \mathrm{div}^* = -\mathrm{grad}, \qquad \mathrm{curl}^* = \mathrm{curl}, \]

and if

\[ Lu = \sum_{|\alpha| \le p} a_\alpha(x) D^\alpha u \]

then

\[ L^* v = \sum_{|\alpha| \le p} (-1)^{|\alpha|} D^\alpha \big( a_\alpha(x) v(x) \big). \]

Ignoring initial as well as boundary conditions, evolution problems are treated similarly in the sense that we integrate by parts over space and time. There is an important difference however because time has a direction and the time variable for the adjoint problem runs "backwards."

Example 1.37 If we have a parabolic problem

\[ Lu = u_t - \nabla \cdot (a \nabla u) + b u, \quad x \in \Omega, \ 0 < t \le T, \]

then

\[ L^* v = -v_t - \nabla \cdot (a \nabla v) + b v, \quad x \in \Omega, \ T > t \ge 0. \]

The adjoint problem is also parabolic, and not an "ill-posed" or "backwards" parabolic problem as suggested by the minus sign in front of the time derivative term. This is easily seen by making the substitution $t \to s = T - t$, so that

\[ L^* v = v_s - \nabla \cdot \big( a(T - s) \nabla v \big) + b(T - s) v, \quad x \in \Omega, \ 0 < s \le T. \]

We find it convenient to use this change of variables when solving the adjoint problem in practice.

In the second stage of computing the adjoint, we remove the assumption that the functions involved in evaluating the bilinear identity have compact support. The integrations by parts that produce the formal adjoint yield additional terms involving integrals of the functions and their derivatives over the boundary of $\Omega$. We then choose boundary conditions for the adjoint problem depending on what we want to happen with the boundary terms from evaluating the bilinear identity.

For example, the standard approach is to pose the minimal boundary conditions on the adjoint problem necessary to make the boundary terms that appear when evaluating the bilinear identity vanish. These are called the adjoint boundary conditions. This definition is rather vague, but it can be made completely precise, see (Lions and Magenes, 1972).

Note that for the purpose of defining the adjoint boundary conditions, the form of the boundary conditions imposed on the original operator $L$ is important, but the values given for these conditions are not. If the boundary conditions for $L$ are not homogeneous, we make them so for the purpose of determining the adjoint. It follows that some of the boundary terms that appear when evaluating the bilinear identity vanish because of the homogeneous boundary conditions imposed on $L$ and the adjoint boundary conditions ensure that the rest vanish.

Example 1.38 Consider Newton's equation of motion $s''(t) = f(t)$, normalized with mass 1. If we assume $s(0) = s'(0) = 0$, and $0 < t < 1$, then we have

\[ s'' v - s v'' = \frac{d}{dt} \big( v s' - s v' \big) \]

and

\[ \int_0^1 (s'' v - s v'') \, dt = \big( v s' - s v' \big) \Big|_0^1. \]

Now the boundary conditions imply the contributions at $t = 0$ vanish, while at $t = 1$ we have

\[ v(1) s'(1) - v'(1) s(1). \]

To ensure this vanishes, we must have $v(1) = v'(1) = 0$. (We cannot specify $s(1)$ or $s'(1)$ of course.) These are the adjoint boundary conditions.
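The adjoint boundary conditions of Ex. 1.38 can be checked on a concrete case. The Python sketch below assumes the force $f(t) = \cos t$ and the weight $\psi \equiv 1$, both chosen only for the illustration, so that $s(t) = 1 - \cos t$ and the adjoint solution of $v'' = \psi$, $v(1) = v'(1) = 0$, is $v(t) = (t-1)^2/2$. With these choices, $(s, \psi) = (s, v'') = (s'', v) = (f, v)$.

```python
import numpy as np

def trapezoid(y, t):
    """Composite trapezoidal rule on a uniform grid."""
    h = t[1] - t[0]
    return h * (0.5 * y[0] + y[1:-1].sum() + 0.5 * y[-1])

t = np.linspace(0.0, 1.0, 2001)

f = np.cos(t)                # assumed force
s = 1.0 - np.cos(t)          # solves s'' = f with s(0) = s'(0) = 0
psi = np.ones_like(t)        # assumed weight defining the functional (s, psi)
v = 0.5 * (t - 1.0) ** 2     # solves v'' = psi with v(1) = v'(1) = 0

qoi_direct = trapezoid(s * psi, t)    # (s, psi)
qoi_adjoint = trapezoid(f * v, t)     # (f, v)
qoi_exact = 1.0 - np.sin(1.0)
```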

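Example 1.39 Since

\[ \int_\Omega (u \Delta v - v \Delta u) \, dx = \int_{\partial\Omega} \Big( u \frac{\partial v}{\partial n} - v \frac{\partial u}{\partial n} \Big) ds, \]

the Dirichlet and Neumann boundary value problems for the Laplacian are their own adjoints.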

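Example 1.40 Let $\Omega \subset \mathbb{R}^2$ be bounded with a smooth boundary and let $s$ = arclength along the boundary. Consider

\[
\begin{aligned}
-\Delta u &= f, && x \in \Omega, \\
\frac{\partial u}{\partial n} + \frac{\partial u}{\partial s} &= 0, && x \in \partial\Omega.
\end{aligned}
\]

Since

\[ \int_\Omega (u \Delta v - v \Delta u) \, dx = \int_{\partial\Omega} \Big( u \Big( \frac{\partial v}{\partial n} - \frac{\partial v}{\partial s} \Big) - v \Big( \frac{\partial u}{\partial n} + \frac{\partial u}{\partial s} \Big) \Big) ds, \]

the adjoint problem is

\[
\begin{aligned}
-\Delta v &= g, && x \in \Omega, \\
\frac{\partial v}{\partial n} - \frac{\partial v}{\partial s} &= 0, && x \in \partial\Omega.
\end{aligned}
\]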


1.4  A  posteriori  error  analysis  using  adjoints 

We now apply functionals, adjoint operators, and variational analysis to the problem of estimating the error of a finite element solution of a partial differential equation. The analysis rests on the observation in Reason #4 above and we begin by extending that argument to differential equations. Given a domain $\Omega$, which could be a time interval, a space domain, or a space-time domain, we consider a problem of the form

\[
\begin{aligned}
& Lu = f, && \text{on } \Omega, \\
& \text{bound. cond. and init. val.}, && \text{on } \partial\Omega,
\end{aligned}
\tag{1.22}
\]

where $L$ is a linear differential operator and we specify the correct boundary and/or initial conditions so that (1.22) has a unique solution. We assume that the goal of solving (1.22) is to compute a quantity of interest given as a linear functional $\ell(u) = (u, \psi)$ for some $\psi$. The generalized Green's function $\phi$ for (1.22) corresponding to $\psi$ satisfies

\[
\begin{aligned}
& L^* \phi = \psi, && \text{on } \Omega, \\
& \text{adjoint bound. cond. and init. val.}, && \text{on } \partial\Omega,
\end{aligned}
\tag{1.23}
\]

where $L^*$ is the formal adjoint of $L$. There are minor variations of this definition if we pose the data $\psi$ on the boundary of $\Omega$ rather than the interior (i.e., as boundary or initial data), see (Wildey et al., 2008). We obtain the basic representation formula,

\[ (u, \psi) = (u, L^* \phi) = (Lu, \phi) = (f, \phi). \]

We use this argument to derive an a posteriori error estimate.

Example 1.41 We begin by returning to Ex. 1.6, and estimating the error $e = X - x$ in the numerical solution $X$ of a linear system of equations

\[ Ax = b. \]

We derive an estimate of the error in a quantity of interest given by a linear functional $(e, \psi)$, where $\psi$ is a given vector. We introduce the generalized Green's vector solving the adjoint problem

\[ A^\top \phi = \psi. \]

Arguing as above,

\[ (e, \psi) = (e, A^\top \phi) = (Ae, \phi) = (R, \phi), \]

where $R = AX - b$. We obtain a representation of the error as an inner product of the computable residual and the solution of the adjoint problem. In practice, we approximate $\phi$ and obtain a computable estimate.

We can also derive a bound

\[ |(e, \psi)| \le \|\phi\| \, \|R\|. \tag{1.24} \]

Returning to the specific example in Ex. 1.6, we find

estimate  of  the  error  in  the  quantity  of  interest  ks  1.0  x  10”*^, 
a  posteriori  error  bound  for  the  quantity  of  interest  w  5.4  x  10“*'*, 
traditional  error  bound  for  the  error  a:  3.5  x  10“^. 

The a posteriori estimate is very accurate. The a posteriori bound overestimates the error since any cancellation in the inner product $(R, \phi)$ is lost, but it is still much better than the traditional condition number bound.

The adjoint quantity $\|\phi\|$ is called the stability factor. It is related to the condition number of $A$ through

\[ \mathrm{cond}_\psi(A) = \|\phi\| \, \|A\| = \|A^{-\top} \psi\| \, \|A\|, \]

which is a kind of "weak" condition number of $A$ with respect to the targeted quantity of interest. If we take the supremum of $\mathrm{cond}_\psi(A)$ over all possible $\psi$ with norm 1, we obtain the standard condition number of $A$. Hence, the stability factor obtained from the generalized Green's function is a measure of the sensitivity of particular information computed from a numerical solution of the problem to computational errors.
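The estimate, the bound (1.24), and the weak condition number can be illustrated on a small synthetic system. The following Python sketch is only an illustration under stated assumptions: the matrix, the perturbed stand-in for a numerical solution `X`, and the choice of $\psi$ are made up for the demonstration and are not the data of Ex. 1.6.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
A = rng.standard_normal((n, n)) + n * np.eye(n)   # synthetic, well-conditioned matrix
x_true = rng.standard_normal(n)
b = A @ x_true
X = x_true + 1e-4 * rng.standard_normal(n)        # stand-in for a numerical solution

psi = np.zeros(n)
psi[0] = 1.0                                      # quantity of interest: first component

R = A @ X - b                                     # computable residual
phi = np.linalg.solve(A.T, psi)                   # generalized Green's vector

estimate = np.dot(R, phi)                         # (R, phi)
bound = np.linalg.norm(phi) * np.linalg.norm(R)   # right-hand side of (1.24)
true_error = np.dot(X - x_true, psi)              # (e, psi) with e = X - x

# Weak condition number for this psi versus the standard 2-norm condition number.
norm_A = np.linalg.norm(A, 2)
cond_psi = np.linalg.norm(phi) * norm_A
# The supremum over unit psi is attained at the right singular vector
# belonging to the smallest singular value of A.
_, _, Vt = np.linalg.svd(A)
cond_worst = np.linalg.norm(np.linalg.solve(A.T, Vt[-1])) * norm_A
cond_standard = np.linalg.cond(A)
```

The estimate reproduces the error in the functional up to rounding, while the bound discards the cancellation in the inner product and is therefore typically larger.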


1.4.1  Discretization  of  elliptic  problems 


We first consider a general second order linear elliptic boundary value problem for a scalar unknown,

\[
\begin{aligned}
Lu &= f, && x \in \Omega, \\
u &= 0, && x \in \partial\Omega,
\end{aligned}
\tag{1.25}
\]

where

\[ L(D, x) u = -\nabla \cdot a(x) \nabla u + b(x) \cdot \nabla u + c(x) u(x), \tag{1.26} \]

with $u : \mathbb{R}^n \to \mathbb{R}$, $a$ is an $n \times n$ matrix function of $x$, $b$ is an $n$-vector function of $x$, and $c$ is a function of $x$. We assume that $\Omega \subset \mathbb{R}^n$, $n = 2, 3$, is a smooth or polygonal domain; $a = (a_{ij})$, where $a_{ij}$ are continuous in $\Omega$ for $1 \le i, j \le n$ and there is an $a_0 > 0$ such that $v^\top a v \ge a_0 |v|^2$ for all $v \in \mathbb{R}^n \setminus \{0\}$ and $x \in \Omega$; $b = (b_i)$ where $b_i$ is continuous in $\Omega$; and finally $c$ and $f$ are continuous in $\Omega$.

We discretize (1.25) by applying a finite element method to the associated variational formulation:

Find $u \in H_0^1(\Omega)$ such that

\[ A(u, v) = (a \nabla u, \nabla v) + (b \cdot \nabla u, v) + (c u, v) = (f, v) \quad \text{for all } v \in H_0^1(\Omega), \tag{1.27} \]

where $H_0^1(\Omega)$ is the subset of functions in $H^1(\Omega)$ that are zero on $\partial\Omega$ and $H^1(\Omega)$ consists of functions that together with their first derivatives are square integrable on $\Omega$.

To construct a finite element discretization, we form a piecewise polygonal approximation of $\partial\Omega$ whose nodes lie on $\partial\Omega$ and which is contained inside $\Omega$. This forms the boundary of a convex polygonal domain $\Omega_h$. We let $\mathcal{T}_h$ denote a simplex triangulation of $\Omega_h$ that is locally quasi-uniform. We let $h_K$ denote the length of the longest edge of $K \in \mathcal{T}_h$ and define the piecewise constant mesh function $h$ by $h(x) = h_K$ for $x \in K$. We also use $h$ to denote $\max_K h_K$. We choose a finite element solution from the space $V_h$ of functions that are continuous on $\Omega$, piecewise linear on $\Omega_h$ with respect to $\mathcal{T}_h$, zero on the boundary $\partial\Omega_h$, and finally extended to be zero in the region $\Omega \setminus \Omega_h$. With this construction, we have $V_h \subset H_0^1(\Omega)$ and for smooth functions, the error of interpolation into $V_h$ is $O(h^2)$ in $\| \cdot \|$, but not better. The finite element method is:

Compute $U \in V_h$ such that $A(U, v) = (f, v)$ for all $v \in V_h$. (1.28)

In these notes, we take for granted the usual a priori convergence results for finite element methods and concentrate on the a posteriori analysis used to produce computational error estimates. In particular, by standard results, we know that $U$ exists and converges to $u$ as $h \to 0$.

1.4.2  A  posteriori  analysis  for  elliptic  problems 

The goal of the a posteriori error analysis is to estimate the error in a quantity of interest $(u, \psi)$ computed from the finite element solution $U$. To do this, we use a generalized Green's function $\phi$ solving the adjoint problem corresponding to $\psi$,

Find $\phi \in H_0^1(\Omega)$ such that

\[ A^*(v, \phi) = (\nabla v, a \nabla \phi) - (v, \mathrm{div}(b \phi)) + (v, c \phi) = (v, \psi) \quad \text{for all } v \in H_0^1(\Omega). \tag{1.29} \]

This is just the weak form of the adjoint problem $L^*(D, x) \phi = \psi$. Extending the analysis above, with $e = u - U$,

\[
\begin{aligned}
(e, \psi) &= (\nabla e, a \nabla \phi) - (e, \mathrm{div}(b \phi)) + (e, c \phi) \\
&= (a \nabla e, \nabla \phi) + (b \cdot \nabla e, \phi) + (c e, \phi) \\
&= (a \nabla u, \nabla \phi) + (b \cdot \nabla u, \phi) + (c u, \phi) - (a \nabla U, \nabla \phi) - (b \cdot \nabla U, \phi) - (c U, \phi) \\
&= (f, \phi) - (a \nabla U, \nabla \phi) - (b \cdot \nabla U, \phi) - (c U, \phi).
\end{aligned}
\]

Letting $\pi_h \phi$ denote an approximation of $\phi$ in $V_h$, using Galerkin orthogonality (1.28), we conclude

Theorem 1.42 The error in the quantity of interest computed from the finite element solution (1.28) satisfies the error representation,

\[ (e, \psi) = (f, \phi - \pi_h \phi) - (a \nabla U, \nabla(\phi - \pi_h \phi)) - (b \cdot \nabla U, \phi - \pi_h \phi) - (c U, \phi - \pi_h \phi), \tag{1.30} \]

where the generalized Green's function $\phi$ satisfies the adjoint problem (1.29) corresponding to data $\psi$.

The most accurate a posteriori error estimates are obtained by using (1.30) directly as opposed to making further estimates. To use the estimate, we approximate $\phi$ using a finite element method. Since $\phi - \pi_h \phi \sim h^2 D^2 \phi$ where $\phi$ is smooth, we use a higher order finite element than that used to solve the original boundary value problem. For example, good results are obtained using the space $V_h^2$ of continuous, piecewise quadratic functions with respect to $\mathcal{T}_h$. The approximate generalized Green's function is

Compute $\Phi \in V_h^2$ such that

\[ A^*(v, \Phi) = (\nabla v, a \nabla \Phi) - (v, \mathrm{div}(b \Phi)) + (v, c \Phi) = (v, \psi) \quad \text{for all } v \in V_h^2. \tag{1.31} \]

The approximate error representation is

\[ (e, \psi) \approx (f, \Phi - \pi_h \Phi) - (a \nabla U, \nabla(\Phi - \pi_h \Phi)) - (b \cdot \nabla U, \Phi - \pi_h \Phi) - (c U, \Phi - \pi_h \Phi). \tag{1.32} \]
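The error representation can be tried out on a one-dimensional model problem. The Python sketch below is an illustration under several assumptions that are not in the text: it takes $a = 1$, $b = 0$, $c = 0$ on $\Omega = (0,1)$, uses the data $f = \pi^2 \sin(\pi x)$ and the weight $\psi \equiv 1$ (the average value), and, for simplicity, inserts the exact adjoint solution $\phi = x(1-x)/2$ instead of a computed higher order approximation $\Phi$. In one dimension the term $(U', (\phi - \pi_h \phi)')$ vanishes element by element, since $U'$ is constant on each element and $\phi - \pi_h \phi$ is zero at the nodes.

```python
import numpy as np

GP, GW = np.polynomial.legendre.leggauss(3)      # 3-point Gauss rule on [-1, 1]

def element_quadrature(a, b):
    """Quadrature points and weights on the element [a, b]."""
    return a + 0.5 * (b - a) * (GP + 1.0), 0.5 * (b - a) * GW

def solve_fem(n_elem, f):
    """Piecewise linear FEM for -u'' = f on (0, 1) with u(0) = u(1) = 0."""
    nodes = np.linspace(0.0, 1.0, n_elem + 1)
    h = 1.0 / n_elem
    n_int = n_elem - 1
    K = (2.0 * np.eye(n_int) - np.eye(n_int, k=1) - np.eye(n_int, k=-1)) / h
    F = np.zeros(n_elem + 1)
    for k in range(n_elem):
        xq, wq = element_quadrature(nodes[k], nodes[k + 1])
        lam = (xq - nodes[k]) / h                # hat function rising on this element
        F[k] += np.sum(wq * f(xq) * (1.0 - lam))
        F[k + 1] += np.sum(wq * f(xq) * lam)
    U = np.zeros(n_elem + 1)
    U[1:-1] = np.linalg.solve(K, F[1:-1])
    return nodes, U

def adjoint_error_estimate(nodes, f, phi):
    """Evaluate (f, phi - pi_h phi) with pi_h the nodal interpolant."""
    total = 0.0
    for a, b in zip(nodes[:-1], nodes[1:]):
        xq, wq = element_quadrature(a, b)
        lam = (xq - a) / (b - a)
        interpolant = (1.0 - lam) * phi(a) + lam * phi(b)
        total += np.sum(wq * f(xq) * (phi(xq) - interpolant))
    return total

f = lambda x: np.pi**2 * np.sin(np.pi * x)       # exact solution u = sin(pi x)
phi = lambda x: 0.5 * x * (1.0 - x)              # solves -phi'' = 1, phi(0) = phi(1) = 0

nodes, U = solve_fem(10, f)
h = nodes[1] - nodes[0]
qoi_exact = 2.0 / np.pi                          # integral of sin(pi x) over (0, 1)
qoi_fem = h * U.sum()                            # trapezoid rule; U vanishes at both ends
true_error = qoi_exact - qoi_fem                 # (e, psi) with e = u - U
estimate = adjoint_error_estimate(nodes, f, phi)
```

Replacing the exact `phi` by a computed approximation, as in (1.31) and (1.32), introduces an additional error in the estimate that this sketch does not show.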

Example 1.43 In (Estep et al., 2002), we estimate the error in the average value of the solution of

\[
\begin{aligned}
-\Delta u &= 200 \pi^2 \sin(10 \pi x) \sin(10 \pi y), && (x, y) \in \Omega = [0, 1] \times [0, 1], \\
u &= 0, && (x, y) \in \partial\Omega.
\end{aligned}
\]

The solution is $u = \sin(10 \pi x) \sin(10 \pi y)$, see Fig. 1.15.

In Fig. 1.15, we show a plot of error/estimate ratios for various degrees of accuracy. Ideally, we would get a ratio of 1. In practice, the accuracy of the estimate is affected by the numerical error in the adjoint and the errors arising from quadrature applied to the integrals in the representation (1.32). At the inaccurate end, we are using meshes with 5 x 5 to 10 x 10 elements. We emphasize that the computed numerical solution bears almost no resemblance to the true solution at those discretization levels, yet the estimate is reasonably accurate.

Fig. 1.15. The solution $u = \sin(10 \pi x) \sin(10 \pi y)$ and a plot of error/estimate ratios for various mesh sizes.

Note that in practice, the error to estimate ratio tends to vary quite a bit, even in the best of circumstances. The accuracy of the estimate is affected by numerical considerations including the accuracy of the computed adjoint solution and the use of quadrature to evaluate the integrals yielding the error estimate. In nonlinear problems, it turns out that there is a linearization error that may be significant.

1.4.3  Adjoint  analysis  for  nonlinear  problems

So far in the discussion of a posteriori analysis, we have treated linear problems. This is no coincidence because the notion of an adjoint to an operator explicitly depends on the linearity of the operator, which means in particular that the operator is independent of the input. This in turn implies that the bilinear identity serves to determine a unique adjoint operator. On reflection, this fact is apparent in the computations in Ex. 1.28.

Above, we derived an a posteriori error estimate for a linear problem by modifying the representation formula involving the generalized Green's vector/function. Subtracting the representation formulas for a solution and an approximation leads to a representation of the error. Now we apply the adjoint analysis technique directly to the problem of deriving an error estimate for an approximate solution of a nonlinear problem.

We assume that $F : X \to Y$ is a nonlinear map between Banach spaces $X$, $Y$ with a convex domain $\mathcal{D}(F)$. Convexity is a typical assumption, with one important consequence being that mean value theorems hold. We let $u \in \mathcal{D}(F)$ solve the nonlinear problem

\[ F(u) = b, \tag{1.33} \]

for some data $b \in Y$ in the range of $F$. We let $U \approx u$ be an approximate solution, where $U \in \mathcal{D}(F)$. The nonlinear residual of $U$ is

\[ R(U) = F(U) - b. \]

With $e = U - u$, we have

\[ F(U) - F(u) = R(U). \tag{1.34} \]

Now we write $U = u + e$ and define the operator

\[ \mathcal{E}(e) = \mathcal{E}(e; u) = F(u + e) - F(u), \]

where $\mathcal{E}(0) = 0$. The fundamental observation is that if $F$ is smooth, e.g. Fréchet differentiable, then $\mathcal{E}(e) \approx F'(u) e$ behaves linearly in $e$ to first order when $e$ is small. Hence, it makes sense to try to define an adjoint to $\mathcal{E}(e)$. The domain of $\mathcal{E}$ is

\[ \mathcal{D}(\mathcal{E}) = \{ v \in X \mid u + v \in \mathcal{D}(F) \}. \]

To be technically precise, we assume that $\mathcal{D}(\mathcal{E})$ is a dense vector subspace of $X$ and independent of $e$.

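We now define the adjoint operator $\mathcal{E}^*$ through

\[ \langle \mathcal{E}(v), w \rangle = \langle v, \mathcal{E}(v)^* w \rangle, \quad \text{for all sufficiently small } v \in \mathcal{D}(\mathcal{E}) \text{ and all } w \in \mathcal{D}(\mathcal{E}^*). \tag{1.35} \]

(Note the notation is somewhat confusing, since $\mathcal{E}(v)$ is an operator applied to $v$ while $\mathcal{E}(v)^*$ is an operator.) This gives the basic representation formula that is useful for error estimation. In the Sobolev space setting, we realize this as

\[ (\mathcal{E}(v), w) = (v, \mathcal{E}(v)^* w), \quad \text{for all sufficiently small } v \in \mathcal{D}(\mathcal{E}) \text{ and all } w \in \mathcal{D}(\mathcal{E}^*). \]

Example 1.44 Consider the map $F : \mathbb{R}^2 \to \mathbb{R}^2$ given by

\[ F(u) = \begin{pmatrix} u_1^2 + 3 u_2 \\ u_1 e^{u_2} \end{pmatrix}. \]

It is easy to compute

\[ \mathcal{E}(\epsilon) = F(u + \epsilon) - F(u) = \begin{pmatrix} 2 u_1 \epsilon_1 + \epsilon_1^2 + 3 \epsilon_2 \\ u_1 e^{u_2} (e^{\epsilon_2} - 1) + \epsilon_1 e^{u_2 + \epsilon_2} \end{pmatrix}. \]

We can write

\[ \mathcal{E}(\epsilon) = \begin{pmatrix} 2 u_1 + \epsilon_1 & 3 \\ e^{u_2 + \epsilon_2} & u_1 e^{u_2} \dfrac{e^{\epsilon_2} - 1}{\epsilon_2} \end{pmatrix} \begin{pmatrix} \epsilon_1 \\ \epsilon_2 \end{pmatrix}. \]

Evaluating (1.35) yields

\[ \mathcal{E}(v)^* = \begin{pmatrix} 2 u_1 + v_1 & e^{u_2 + v_2} \\ 3 & u_1 e^{u_2} \dfrac{e^{v_2} - 1}{v_2} \end{pmatrix}. \]

In the limit of small $v$, we recognize that $\mathcal{E}(v)^* \approx (F'(u))^\top$, where $F'(u)$ is the Jacobian of $F$ at $u$,

\[ F'(u) = \begin{pmatrix} 2 u_1 & 3 \\ e^{u_2} & u_1 e^{u_2} \end{pmatrix}. \]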
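The computations of Ex. 1.44 can be checked numerically. The Python sketch below assumes the map reads $F(u) = (u_1^2 + 3u_2, \ u_1 e^{u_2})^\top$; the function names and the sample points `u`, `v`, `w` are hypothetical choices made for the illustration.

```python
import numpy as np

def F(u):
    return np.array([u[0] ** 2 + 3.0 * u[1], u[0] * np.exp(u[1])])

def secant_matrix(u, v):
    """Matrix M(v) with F(u + v) - F(u) = M(v) v (requires v[1] != 0)."""
    g = np.expm1(v[1]) / v[1]                      # (e^{v2} - 1) / v2
    return np.array([[2.0 * u[0] + v[0], 3.0],
                     [np.exp(u[1] + v[1]), u[0] * np.exp(u[1]) * g]])

def jacobian(u):
    return np.array([[2.0 * u[0], 3.0],
                     [np.exp(u[1]), u[0] * np.exp(u[1])]])

u = np.array([0.7, -0.2])
v = np.array([0.05, 0.03])
w = np.array([1.3, -0.4])

M = secant_matrix(u, v)
E_v = F(u + v) - F(u)
adjoint = M.T                                      # matrix of the adjoint operator

lhs = np.dot(E_v, w)                               # (E(v), w)
rhs = np.dot(v, adjoint @ w)                       # (v, E(v)* w)

# For small v the adjoint approaches the transposed Jacobian.
gap_small = np.max(np.abs(secant_matrix(u, 1e-6 * v).T - jacobian(u).T))
```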


This example suggests one systematic way to define an adjoint operator for error analysis. When $F$ is Fréchet differentiable, we use the Integral Mean Value Theorem to write

\[ \mathcal{E}(e) = F(u + e) - F(u) = \Big( \int_0^1 F'(u + s e) \, ds \Big) e = \Big( \int_0^1 F'(s u + (1 - s) U) \, ds \Big) e. \tag{1.36} \]

If we define the "average" Jacobian

\[ \overline{F'} = \int_0^1 F'(u + s e) \, ds = \int_0^1 F'(s u + (1 - s) U) \, ds, \]

then we can use $\mathcal{E}(e)^* = (\overline{F'})^*$ as an adjoint in the analysis.

In much of the literature on a posteriori error analysis for nonlinear problems, the standard way to define an adjoint operator is to use the Integral Mean Value Theorem approach. Note that in practice, $u$ is typically unknown so $\overline{F'}$ is not computable. Typically, we simply linearize around $U$, i.e. replace $\overline{F'} \to F'(U)$. This may be an issue if $u$ and $U$ are sufficiently far apart that they are associated with significantly different adjoint operators.

Example 1.45 We continue Ex. 1.44 by examining the invertibility of $\mathcal{E}(v)^*$. We can use column operations to obtain the triangular matrix


(3-(2u,+n,)n, 

Thus, the invertibility of $\mathcal{E}(v)^*$ is determined by the distance of $3 - (2 u_1 + v_1) u_1 (1 - e^{-v_2}) / v_2$ to 0.

Expanding this quadratic function in $u_1$, we find that the roots are equal to $\pm \sqrt{3/2}$ when $v_1 = v_2 = 0$ and nearby for small $v_1$, $v_2$. If we bound $u_1$ away from these critical values,

,3  , 

>  0, 

for some constant $c$, we find that


3  —  (2rii  +  vi 


cxp(— 02) 
V2 


>  '1<?  -  |ui|0(|ui|)  -  |tii|^0(|i;2|) 


We conclude that there is a constant $c$ such that $\mathcal{E}(v)^*$ is uniformly invertible for


|W2|  < 


c 


Hence, linearization around two points with nearby values of $u_1$ produces adjoint operators with nearly the same stability (invertibility) properties.

On the other hand, if $|u_1| \approx \sqrt{3/2}$ then there are critical values of $v_1$ and $v_2$ which make the operator $\mathcal{E}(v)^*$ non-invertible. In this case, linearization around two points that are near $u_1 \approx \pm \sqrt{3/2}$ may yield adjoint operators with substantially different stability properties.


We  also  note  that  (1.35)  does  not  define  a  unique  adjoint  operator  in  general. 


Example 1.46 Suppose that $\mathcal{E}(e)$ can be written as $\mathcal{E}(e) = A(e) e$, where $A(e)$ is a linear operator with $\mathcal{D}(\mathcal{E}) \subset \mathcal{D}(A)$. For a fixed $e \in \mathcal{D}(\mathcal{E})$, we can define the adjoint of $A$ satisfying $(A(e) w, v) = (w, A^*(e) v)$ for all $w \in \mathcal{D}(A)$, $v \in \mathcal{D}(A^*)$ as usual. Substituting $w = e$ shows this defines an adjoint of $\mathcal{E}$ as well. If there are several such linear operators $A$, then there are generally several different possible adjoints.

Following (Marchuk et al., 1996), for $(t, x) \in \Omega = (0, 1) \times (0, 1)$, we let $X = X^* = Y = Y^*$ be the space of periodic functions in $t$ and $x$ with period 1. Consider the Burgers equation

\[ u_t + u u_x + a u = f, \tag{1.37} \]

where $a > 0$ is constant and $f$ is a periodic function, and we apply periodic boundary conditions on $\Omega$. Straightforward computation yields

\[ \mathcal{E}(e) = \frac{\partial e}{\partial t} + (u + e) \frac{\partial e}{\partial x} + (u_x + a) e. \]

We have $\mathcal{E}(e) = A_1(e) e$, where

\[ A_1(e) v = \frac{\partial v}{\partial t} + (u + e) \frac{\partial v}{\partial x} + (u_x + a) v \]

and the adjoint is

\[ A_1^*(e) w = -\frac{\partial w}{\partial t} - \frac{\partial ((u + e) w)}{\partial x} + (u_x + a) w. \]

We also have $\mathcal{E}(e) = A_2(e) e$, where

\[ A_2(e) v = \frac{\partial v}{\partial t} + u \frac{\partial v}{\partial x} + (u_x + e_x + a) v \]

and

\[ A_2^*(e) w = -\frac{\partial w}{\partial t} - \frac{\partial (u w)}{\partial x} + (u_x + e_x + a) w. \]
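The non-uniqueness in Ex. 1.46 can be observed on a periodic grid. The Python sketch below is an illustration under stated assumptions: it uses Fourier differentiation on an $N \times N$ grid with $N$ odd (so that the discrete derivative is exactly skew-symmetric), and the fields `u`, `e`, the constant `a`, and the random test functions `v`, `w` are arbitrary choices not taken from the text.

```python
import numpy as np

N = 15                                        # odd grid size (assumed)
grid = np.arange(N) / N
T, X = np.meshgrid(grid, grid, indexing="ij")
ik = 2j * np.pi * np.fft.fftfreq(N, d=1.0 / N)

def d_dt(v):
    return np.real(np.fft.ifft(ik[:, None] * np.fft.fft(v, axis=0), axis=0))

def d_dx(v):
    return np.real(np.fft.ifft(ik[None, :] * np.fft.fft(v, axis=1), axis=1))

def inner(p, q):
    return np.sum(p * q) / N**2               # discrete L2 inner product

a = 0.5
u = np.sin(2 * np.pi * X) * np.cos(2 * np.pi * T)
e = 0.1 * np.cos(2 * np.pi * (X + T))
u_x, e_x = d_dx(u), d_dx(e)

def burgers(w):
    """Left-hand side of the Burgers equation, w_t + w w_x + a w."""
    return d_dt(w) + w * d_dx(w) + a * w

def A1(v):
    return d_dt(v) + (u + e) * d_dx(v) + (u_x + a) * v

def A1_adj(w):
    return -d_dt(w) - d_dx((u + e) * w) + (u_x + a) * w

def A2(v):
    return d_dt(v) + u * d_dx(v) + (u_x + e_x + a) * v

def A2_adj(w):
    return -d_dt(w) - d_dx(u * w) + (u_x + e_x + a) * w

E_e = burgers(u + e) - burgers(u)             # E(e) from its definition

rng = np.random.default_rng(5)
v = rng.standard_normal((N, N))
w = rng.standard_normal((N, N))
gap1 = abs(inner(A1(v), w) - inner(v, A1_adj(w)))
gap2 = abs(inner(A2(v), w) - inner(v, A2_adj(w)))
adjoint_difference = np.max(np.abs(A1_adj(w) - A2_adj(w)))
```

Both operators reproduce $\mathcal{E}(e)$ when applied to `e` and both satisfy the bilinear identity, yet the two adjoints act differently on a generic `w`.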

Returning to the original problem, once the adjoint is defined, we can derive a representation formula for the error. In the Sobolev space setting, we note that

\[ (R(U), w) = (F(U) - F(u), w) = (\mathcal{E}(e), w) = (e, \mathcal{E}(e)^* w). \]

To estimate the error in a quantity of interest $(e, \psi)$, we let the generalized Green's function $\phi$ solve

\[ \mathcal{E}(e)^* \phi = \psi, \]

and we obtain

\[ (e, \psi) = (R(U), \phi). \]
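The nonlinear error representation, and the effect of linearizing around $U$ instead of using the average Jacobian, can be sketched for a small system. The Python example assumes the two-dimensional map $F(u) = (u_1^2 + 3u_2, \ u_1 e^{u_2})^\top$; the "true" solution `u`, the perturbed stand-in `U`, the functional `psi`, and the number of quadrature points are hypothetical choices made for the illustration.

```python
import numpy as np

def F(u):
    return np.array([u[0] ** 2 + 3.0 * u[1], u[0] * np.exp(u[1])])

def jacobian(u):
    return np.array([[2.0 * u[0], 3.0],
                     [np.exp(u[1]), u[0] * np.exp(u[1])]])

def average_jacobian(u, U, n_quad=8):
    """Gauss quadrature for the integral of F'(s u + (1 - s) U) over s in [0, 1]."""
    pts, wts = np.polynomial.legendre.leggauss(n_quad)
    s, w = 0.5 * (pts + 1.0), 0.5 * wts
    return sum(wi * jacobian(si * u + (1.0 - si) * U) for si, wi in zip(s, w))

u = np.array([0.7, -0.2])              # "true" solution, known here by construction
b = F(u)
U = u + np.array([0.02, -0.01])        # stand-in for an approximate solution
e = U - u
R = F(U) - b                           # nonlinear residual
psi = np.array([1.0, 0.0])             # quantity of interest: first component

# Adjoint based on the average Jacobian (not computable when u is unknown).
phi_avg = np.linalg.solve(average_jacobian(u, U).T, psi)
# Adjoint based on linearization around U (computable in practice).
phi_lin = np.linalg.solve(jacobian(U).T, psi)

true_error = np.dot(e, psi)
estimate_avg = np.dot(R, phi_avg)
estimate_lin = np.dot(R, phi_lin)
```

In this sketch the average Jacobian reproduces the error up to quadrature and rounding error, while linearization around `U` leaves a small linearization error of second order in `e`.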


1.4.4  Discretization  of  evolution  problems 

We consider a reaction-diffusion equation for the solution $u$ on an interval $[0, T]$,

\[
\begin{aligned}
\dot u - \nabla \cdot (\epsilon(x, t) \nabla u) &= f(u, x, t), && (x, t) \in \Omega \times (0, T], \\
u(x, t) &= 0, && (x, t) \in \partial\Omega \times (0, T], \\
u(x, 0) &= u_0(x), && x \in \Omega,
\end{aligned}
\tag{1.38}
\]

where $\Omega$ is a convex polygonal domain with boundary $\partial\Omega$, $\dot u$ denotes the partial derivative of $u$ with respect to time, and there is a constant $\epsilon_0 > 0$ such that

\[ \epsilon(x, t) \ge \epsilon_0, \quad x \in \Omega, \ t > 0. \]

We also assume that $\epsilon$ and $f$ have smooth second derivatives and for simplicity, we write $f(u, x, t) = f(u)$. Everything in this paper extends directly to problems with different boundary conditions, convection, nonlinear diffusion coefficients, and systems of equations, see (Estep et al., 2000).

We describe two finite element space-time discretizations of (1.38) called the
continuous and discontinuous Galerkin methods, see (Estep, 1995; Estep and
French, 1994; Estep et al., 2000; Estep and Stuart, 2002). We can represent many
standard "finite element in space, finite difference scheme in time" methods as one
of these two methods with an appropriate choice of quadrature for evaluating
the integrals defining the finite element approximation. We partition $[0, T]$ as
$0 = t_0 < t_1 < t_2 < \cdots < t_n < \cdots < t_N = T$, denoting each time interval by
$I_n = (t_{n-1}, t_n]$ and time step by $k_n = t_n - t_{n-1}$. We use $k$ to denote the piecewise
constant function that is $k_n$ on $I_n$. We discretize $\Omega$ using a set of elements $\mathcal{T}$
as described in Sec. 1.4.1. We describe the notation when the space mesh is the
same for all time steps. In general, we can employ different meshes for each time
step.

The approximations are polynomials in time and piecewise polynomials in
space on each space-time "slab" $S_n = \Omega \times I_n$. In space, we let $V \subset H_0^1(\Omega)$
denote the space of piecewise linear continuous functions defined on $\mathcal{T}$, where
each function is zero on $\partial\Omega$. Then on each slab, we define

\[ W_n^q = \Big\{ w(x,t) : w(x,t) = \sum_{j=0}^{q} t^j v_j(x), \ v_j \in V, \ (x,t) \in S_n \Big\}. \]

Finally, we let $W^q$ denote the space of functions defined on the space-time domain
$\Omega \times [0, T]$ such that $v|_{S_n} \in W_n^q$ for $n \ge 1$. Note that functions in $W^q$ may be
discontinuous across the discrete time levels and we denote the jump across $t_n$
by $[w]_n = w_n^+ - w_n^-$, where $w_n^\pm = \lim_{s \to t_n^\pm} w(s)$.

We use a projection operator into $V$, $Pv \in V$, e.g. the $L^2$ projection satisfying
$(Pv, w) = (v, w)$ for all $w \in V$, where $(\cdot,\cdot)$ denotes the $L^2(\Omega)$ inner product. We
use $\|\cdot\|$ for the $L^2$ norm. We also use a projection operator into the piecewise
polynomial functions in time, denoted by $\pi_n : L^2(I_n) \to \mathcal{P}^q(I_n)$, where $\mathcal{P}^q(I_n)$ is
the space of polynomials of degree $q$ or less defined on $I_n$. The global projection
operator $\pi$ is defined by setting $\pi = \pi_n$ on $S_n$.

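The continuous Galerkin cG(q) approximation $U \in W^q$ satisfies $U_0^- = P u_0$
and

\[
\begin{cases}
\displaystyle \int_{t_{n-1}}^{t_n} \big( (\dot U, v) + (\epsilon\nabla U, \nabla v) \big)\, dt = \int_{t_{n-1}}^{t_n} (f(U), v)\, dt
& \text{for all } v \in W_n^{q-1}, \ 1 \le n \le N, \\
U_{n-1}^+ = U_{n-1}^-.
\end{cases}
\tag{1.39}
\]

Note that $U$ is continuous across time nodes.

The discontinuous Galerkin dG(q) approximation $U \in W^q$ satisfies $U_0^- = P u_0$ and

\[
\int_{t_{n-1}}^{t_n} \big( (\dot U, v) + (\epsilon\nabla U, \nabla v) \big)\, dt + ([U]_{n-1}, v_{n-1}^+) = \int_{t_{n-1}}^{t_n} (f(U), v)\, dt
\quad \text{for all } v \in W_n^q, \ 1 \le n \le N.
\tag{1.40}
\]

Note that the true solution satisfies both (1.39) and (1.40).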

Example 1.47 To illustrate, we discretize the scalar problem

\[
\begin{cases}
\dot u - \Delta u = f(u), & (x,t) \in \Omega \times \mathbb{R}^+, \\
u(x,t) = 0, & (x,t) \in \partial\Omega \times \mathbb{R}^+, \\
u(x,0) = u_0(x), & x \in \Omega,
\end{cases}
\tag{1.41}
\]

using the dG(0) method. Since $U$ is constant in time on each time interval, we let
$U_n^-$ denote the $M$ vector of nodal values with respect to the nodal basis $\{\eta_i\}_{i=1}^M$
for $V$. We let $B : (B)_{ij} = (\eta_i, \eta_j)$ for $1 \le i,j \le M$ denote the mass matrix and
$A : (A)_{ij} = (\nabla\eta_i, \nabla\eta_j)$ denote the stiffness matrix. Then $U_n^-$ satisfies

\[ (B + k_n A)U_n^- - F(U_n^-)k_n = B U_{n-1}^-, \quad 1 \le n \le N, \]

where $(F(U_n^-))_i = (f(U_n^-), \eta_i)$.
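
A minimal sketch of the dG(0) system above in one space dimension may help fix ideas. It assumes $\Omega = (0,1)$, a uniform mesh with piecewise linear elements, a lumped quadrature $(f(U), \eta_i) \approx h f(U_i)$ for the load vector, the nodal interpolant in place of $P u_0$, and a simple fixed-point iteration for the nonlinear system. All function names are hypothetical and none of these choices is prescribed by the text.

```python
import math

def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are not used."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def tridiag_mult(off, main, v):
    """Multiply the symmetric Toeplitz tridiagonal matrix (off, main, off) by v."""
    n = len(v)
    out = []
    for i in range(n):
        s = main * v[i]
        if i > 0:
            s += off * v[i - 1]
        if i < n - 1:
            s += off * v[i + 1]
        out.append(s)
    return out

def dg0_solve(f, u0, m, k, n_steps, max_iters=50, tol=1e-14):
    """dG(0) in time, piecewise linears in space, m interior nodes on (0, 1)."""
    h = 1.0 / (m + 1)
    b_main, b_off = 4.0 * h / 6.0, h / 6.0   # mass matrix B
    a_main, a_off = 2.0 / h, -1.0 / h        # stiffness matrix A
    lower = [b_off + k * a_off] * m
    diag = [b_main + k * a_main] * m
    upper = [b_off + k * a_off] * m
    U = [u0((i + 1) * h) for i in range(m)]  # nodal interpolant (assumption)
    for _ in range(n_steps):
        rhs0 = tridiag_mult(b_off, b_main, U)          # B U_{n-1}
        V = U[:]
        for _ in range(max_iters):
            # (B + k A) V = B U_{n-1} + k F(V), with F lagged and lumped
            rhs = [rhs0[i] + k * h * f(V[i]) for i in range(m)]
            V_new = thomas(lower, diag, upper, rhs)
            change = max(abs(a - b) for a, b in zip(V_new, V))
            V = V_new
            if change < tol:
                break
        U = V
    return U

# Usage: linear heat equation (f = 0) with u0 = sin(pi x), integrated to T = 0.1.
U = dg0_solve(lambda v: 0.0, lambda x: math.sin(math.pi * x), m=19, k=0.001, n_steps=100)
print(U[9], math.exp(-math.pi ** 2 * 0.1))  # midpoint value vs exact decay factor
```

The fixed-point iteration is only a convenience for the sketch; for stiff reaction terms a Newton iteration would be the more usual choice.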

As mentioned, with an appropriate use of quadrature to evaluate the integrals
in the variational formulation, these Galerkin methods yield standard difference
schemes. We write these standard numerical methods as space-time finite element
methods in order to make use of adjoints and variational analysis.

Example 1.48 In the example above, if the lumped mass quadrature is used to
evaluate the coefficients of $B$, then the resulting set of equations for the dG(0)
approximation is the same as the equations for the nodal values of the backward
Euler, second order centered difference scheme for (1.41).

The dG(0) method is related to the backward Euler method, the cG(1) method
is related to the Crank-Nicolson scheme, and the dG(1) method is related to the
third order sub-diagonal Padé difference scheme, see (Jamet, 1978; Delfour and
Dubeau, 1986; Delfour et al., 1981; Thomée, 1980; Eriksson et al., 1985; Estep
and Larsson, 1993; Estep, 1995; Estep and French, 1994; Estep and Stuart, 2002).

Under general assumptions, the cG(q) and dG(q) methods have order of accuracy $q+1$
in time and 2 in space at any point. In addition, they enjoy a superconvergence
property in time at time nodes. The dG(q) method has order of accuracy $2q+1$
in time and the cG(q) method has order $2q$ in time at time nodes for sufficiently
smooth solutions.
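
The nodal orders quoted above can be observed on the scalar test problem $\dot u = \lambda u$, $u(0) = 1$. The sketch below assumes this linear autonomous problem, for which dG(0) reduces exactly to the backward Euler recursion and cG(1) to the Crank-Nicolson recursion. The function names are illustrative only.

```python
import math

def dg0_final(lam, T, n):
    """dG(0) for u' = lam*u: U_n = U_{n-1} / (1 - k*lam) (backward Euler)."""
    k, U = T / n, 1.0
    for _ in range(n):
        U = U / (1.0 - k * lam)
    return U

def cg1_final(lam, T, n):
    """cG(1) for u' = lam*u: the Crank-Nicolson recursion at time nodes."""
    k, U = T / n, 1.0
    for _ in range(n):
        U = U * (1.0 + 0.5 * k * lam) / (1.0 - 0.5 * k * lam)
    return U

def observed_order(method, lam=-1.0, T=1.0, n=50):
    """Estimate the nodal convergence order by halving the time step."""
    exact = math.exp(lam * T)
    err_coarse = abs(method(lam, T, n) - exact)
    err_fine = abs(method(lam, T, 2 * n) - exact)
    return math.log(err_coarse / err_fine, 2)

print(observed_order(dg0_final))  # close to 2q + 1 = 1 for q = 0
print(observed_order(cg1_final))  # close to 2q = 2 for q = 1
```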

1.4.5  Analysis  for  discretizations  of  evolution  problems 

We begin the a posteriori analysis by defining a suitable adjoint problem for
error analysis. The adjoint problem is a linear parabolic problem with coefficients
obtained by linearization around an average of the true and approximate
solutions,

\[ \bar f = \bar f(u, U) = \int_0^1 \frac{\partial f}{\partial u}\big(us + U(1 - s)\big)\, ds. \tag{1.42} \]

The regularity of $u$ and $U$ typically implies that $\bar f$ is piecewise continuous with
respect to $t$ and a continuous, $H^1$ function in space.
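
The point of the averaged linearization (1.42) is the exact identity $f(u) - f(U) = \bar f\,(u - U)$, which follows from the fundamental theorem of calculus. A small sketch, assuming the bistable nonlinearity $f(v) = v - v^3$ purely as an illustration, checks this numerically. Simpson's rule is exact here because $f'$ is quadratic in $s$.

```python
def f(v):
    """Illustrative reaction term (an assumption for this sketch)."""
    return v - v ** 3

def df(v):
    """Derivative of f with respect to its argument."""
    return 1.0 - 3.0 * v * v

def f_bar(u, U):
    """Averaged linearization (1.42) via Simpson's rule on [0, 1]."""
    g = lambda s: df(u * s + U * (1.0 - s))
    return (g(0.0) + 4.0 * g(0.5) + g(1.0)) / 6.0

u_val, U_val = 0.7, 0.4  # stand-ins for the true and approximate values at a point
print(f_bar(u_val, U_val) * (u_val - U_val), f(u_val) - f(U_val))
```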

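Written out pointwise for convenience, the adjoint problem to (1.38) for the
generalized Green's function associated to the data $\psi$, which determines the quantity
of interest

\[ \int_0^T (u, \psi)\, dt, \tag{1.43} \]

is

\[
\begin{cases}
-\dot\phi - \nabla \cdot (\epsilon\nabla\phi) - \bar f \phi = \psi, & (x,t) \in \Omega \times (T, 0], \\
\phi(x,t) = 0, & (x,t) \in \partial\Omega \times (T, 0], \\
\phi(x,T) = 0, & x \in \Omega.
\end{cases}
\tag{1.44}
\]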

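Using this definition, for the dG method we have

\[
\int_0^T (e, \psi)\, dt = \int_0^T \big(e, -\dot\phi - \nabla\cdot(\epsilon\nabla\phi) - \bar f\phi\big)\, dt
= \sum_{n=1}^{N} \int_{t_{n-1}}^{t_n} \big(e, -\dot\phi - \nabla\cdot(\epsilon\nabla\phi) - \bar f\phi\big)\, dt.
\]

We integrate by parts in time for

\[ \int_{t_{n-1}}^{t_n} (e, -\dot\phi)\, dt = -(e_n^-, \phi_n) + (e_{n-1}^+, \phi_{n-1}) + \int_{t_{n-1}}^{t_n} (\dot e, \phi)\, dt. \]

Likewise,

\[ \int_{t_{n-1}}^{t_n} \big(e, -\nabla\cdot(\epsilon\nabla\phi)\big)\, dt = \int_{t_{n-1}}^{t_n} (\epsilon\nabla e, \nabla\phi)\, dt. \]

Finally, with $e = u - U$ and the definition (1.42),

\[ \int_{t_{n-1}}^{t_n} (e, \bar f\phi)\, dt = \int_{t_{n-1}}^{t_n} (\bar f e, \phi)\, dt = \int_{t_{n-1}}^{t_n} (f(u) - f(U), \phi)\, dt. \]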

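Next we realize that the true solution satisfies the weak formulation

\[ \int_{t_{n-1}}^{t_n} \big( (\dot u, \phi) + (\epsilon\nabla u, \nabla\phi) - (f(u), \phi) \big)\, dt = 0, \]

hence, with $e = u - U$,

\[
\int_0^T (e, \psi)\, dt = \sum_{n=1}^{N} \Big( -(e_n^-, \phi_n) + (e_{n-1}^+, \phi_{n-1})
- \int_{t_{n-1}}^{t_n} \big( (\dot U, \phi) + (\epsilon\nabla U, \nabla\phi) - (f(U), \phi) \big)\, dt \Big).
\]

The sum of terms arising from the integration by parts in time simplifies

\[
\sum_{n=1}^{N} \big( -(e_n^-, \phi_n) + (e_{n-1}^+, \phi_{n-1}) \big)
= (e_0^+, \phi_0) + \sum_{n=1}^{N-1} ([e]_n, \phi_n) - (e_N^-, \phi_N),
\]

and then simplifies further upon realizing that $u_n^- = u_n^+$ and $\phi_N = 0$. Using the
definition (1.40) for the dG method, we obtain

Theorem 1.49

\[
\int_0^T (e, \psi)\, dt = ((I - P)u_0, \phi(0)) + \sum_{j=1}^{N} \big([U]_{j-1}, (\pi P\phi - \phi)_{j-1}^+\big)
+ \int_0^T \big( (\dot U, \pi P\phi - \phi) + (\epsilon\nabla U, \nabla(\pi P\phi - \phi)) - (f(U), \pi P\phi - \phi) \big)\, dt.
\tag{1.45}
\]

The initial error is $e^-(0) = (I - P)u_0$.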

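If instead we desire to estimate the error in $(u(T), \psi)$, for a function $\psi$, then the adjoint
problem is

\[
\begin{cases}
-\dot\phi - \nabla \cdot (\epsilon\nabla\phi) - \bar f \phi = 0, & (x,t) \in \Omega \times (T, 0], \\
\phi(x,t) = 0, & (x,t) \in \partial\Omega \times (T, 0], \\
\phi(x,T) = \psi, & x \in \Omega.
\end{cases}
\tag{1.46}
\]

The resulting estimate is

Theorem 1.50

\[
(e(T), \psi) = ((I - P)u_0, \phi(0)) + \sum_{j=1}^{N} \big([U]_{j-1}, (\pi P\phi - \phi)_{j-1}^+\big)
+ \int_0^T \big( (\dot U, \pi P\phi - \phi) + (\epsilon\nabla U, \nabla(\pi P\phi - \phi)) - (f(U), \pi P\phi - \phi) \big)\, dt.
\tag{1.47}
\]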

A similar argument for the cG method, say for the global quantity of interest
(1.43), yields

Theorem 1.51

\[
\int_0^T (e, \psi)\, dt = ((I - P)u_0, \phi(0))
+ \int_0^T \big( (\dot U, \pi P\phi - \phi) + (\epsilon\nabla U, \nabla(\pi P\phi - \phi)) - (f(U), \pi P\phi - \phi) \big)\, dt.
\tag{1.48}
\]


In practice, we compute a numerical solution of a linear adjoint problem obtained
from (1.44). Typically, we linearize around the computed approximate
solution and solve using a higher order method in space and time. Without
specifying the details, we denote the approximate adjoint solution by $\Phi$. Focusing on
the dG method, where application to the cG method is obvious, the approximate
a posteriori error estimate then reads

\[
\Big| \int_0^T (e, \psi)\, dt \Big| \approx \mathbb{E}(U) = \mathbb{E}(U; \Phi)
= \Big| ((I - P)u_0, \Phi(0)) + \sum_{j=1}^{N} \big([U]_{j-1}, (\pi P\Phi - \Phi)_{j-1}^+\big)
+ \int_0^T \big( (\dot U, \pi P\Phi - \Phi) + (\epsilon\nabla U, \nabla(\pi P\Phi - \Phi)) - (f(U), \pi P\Phi - \Phi) \big)\, dt \Big|.
\tag{1.49}
\]
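
To see the structure of the dG estimate in the simplest setting, consider the scalar problem $\dot u = \lambda u$ discretized by dG(0), with the pointwise quantity of interest $e(T)$, so that the adjoint data is the terminal value $\phi(T) = 1$. This is a minimal sketch under stated assumptions: there is no space discretization (so $P = I$ and the initial term vanishes), the problem is linear (so $\bar f = \lambda$), and the adjoint $\phi(t) = e^{\lambda(T-t)}$ is used in closed form rather than computed numerically as it would be in practice. Because $U$ is piecewise constant and $\pi\phi$ is the interval average of $\phi$, the integral term drops out and only the jump terms remain.

```python
import math

def dg0_values(lam, u0, T, n):
    """dG(0) values: U[0] = U_0^- = u0 and U[j] is the constant value on I_j."""
    k = T / n
    U = [u0]
    for _ in range(n):
        U.append(U[-1] / (1.0 - k * lam))
    return U

def adjoint(lam, T, t):
    """Exact solution of -phi' - lam*phi = 0 with phi(T) = 1."""
    return math.exp(lam * (T - t))

def estimate_and_error(lam, u0, T, n):
    """Return (adjoint-weighted estimate of e(T), true error u(T) - U_N)."""
    k = T / n
    U = dg0_values(lam, u0, T, n)
    est = 0.0
    for j in range(1, n + 1):
        t0, t1 = (j - 1) * k, j * k
        jump = U[j] - U[j - 1]                              # [U]_{j-1}
        phi_avg = (adjoint(lam, T, t0) - adjoint(lam, T, t1)) / (lam * k)  # pi*phi on I_j
        est += jump * (phi_avg - adjoint(lam, T, t0))       # (pi*phi - phi)^+_{j-1}
    true_error = u0 * math.exp(lam * T) - U[n]
    return est, true_error

est, err = estimate_and_error(lam=-1.0, u0=1.0, T=1.0, n=20)
print(est, err)  # the two numbers agree up to rounding for this linear problem
```

For a nonlinear problem the agreement is no longer exact, since the adjoint is then linearized around the computed solution and solved approximately.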


Example 1.52 In (Sandelin, 2006), we consider the accuracy of the a posteriori
error estimate applied to the chaotic Lorenz problem

\[
\begin{cases}
\dot u_1 = -10u_1 + 10u_2, \\
\dot u_2 = 28u_1 - u_2 - u_1u_3, \\
\dot u_3 = -\tfrac{8}{3}u_3 + u_1u_2,
\end{cases}
\quad 0 < t,
\qquad u_1(0) = -6.9742, \ u_2(0) = -7.008, \ u_3(0) = 25.1377.
\]

The terms in (1.49) describing space discretization simply drop out in this case,
and we compute the resulting estimate.

In Fig. 1.16, we show the accuracy of the a posteriori error estimate for
pointwise values of each component at many times. Similar accuracy is obtained
for other functionals, e.g. average error.

Fig. 1.16. Left: We plot an approximate error in each component of a numerical
solution of the Lorenz problem computed by taking the difference between
solutions with estimated error .001 and .0001 at many times. Right: We plot
the pointwise error/estimate ratios for each component versus time at many
time points.

We illustrate the idea that the solution of the adjoint problem provides a kind
of condition number for the computed solution. Following (Estep and Johnson,
1998), in Fig. 1.17, we show that the adjoint solution grows very rapidly when
the solution passes through the tiny region near the separatrix. On the other hand,
the residual error of the solution remains small in this region. In this case, using
only the residual, or indeed the "local error", fails completely to indicate that
the error of the solution increases rapidly in a neighborhood of the separatrix.


Fig. 1.17. Left: We plot the accurate and inaccurate numerical solutions during
the time that the inaccurate solution becomes 100% inaccurate along with the
separatrix. The inaccurate solution steps on the wrong side of the separatrix.
Middle: We plot the norm of the adjoint solutions corresponding to pointwise
error. The adjoint solution grows exponentially rapidly only when the solution
passes near the separatrix. Right: We plot the residual for the inaccurate
solution, which remains small even when the error becomes large.


Example 1.53 In (Estep et al., 2002; Estep and Williams, 1996; Estep et al.,
2000), we compute the a posteriori error estimate for the well-known bistable
(Allen-Cahn) problem $\dot u - \epsilon\Delta u = u - u^3$ posed with Neumann boundary conditions.
This is used to model the motion of domain walls in a ferromagnetic material.
The problem has two attracting steady state solutions 1 and -1. Generic
solutions eventually converge to one of these two steady state solutions, but the
evolution towards a steady state can take some time because of the
competition between stable processes, e.g. the diffusion that tends
to drive a solution towards zero and the reaction that tends to drive a solution
towards 1 or -1.

In one dimension, generic solutions form a pattern of layers between the values
of -1 and 1, then solutions undergo long periods of metastability during which the
"horizontal" motion of the layers is extremely slow, punctuated by rapid transients
in which the solution moves to another metastable state or the final stable
state. We show the evolution of a numerical approximation of a metastable solution
with two metastable periods [0,44] and [44,144] in Fig. 1.18. The timescale
of metastable periods increases exponentially in $1/\sqrt{\epsilon}$ as the diffusion coefficient
$\epsilon$ decreases.

Fig. 1.18. Left: The evolution of a metastable solution of the bistable problem
with $\epsilon = .0009$. Middle: Evolution of the time and space residuals on a uniform
discretization. Right: The evolution of the absolute adjoint weights for
pointwise errors for time and space.

We plot the space norms of the residuals versus time for numerical solutions
computed on uniform discretizations. The time residuals reflect an initial
transient and then the two transients concluding the metastable periods. The
space residual simply becomes smaller as the layers disappear. Finally, we plot
the space norm of the adjoint weights corresponding to the time and space
parts of the error estimate for the quantities of interest equal to pointwise values
of the solution at a set of uniform time points. The weights grow in advance of
the transients concluding the metastable periods but immediately decrease to
1 or smaller right after the transient, indicating that accumulated errors have
damped. Thus, while the effects of errors grow during metastable periods, the
overall error accumulation remains bounded, implying that accurate long time
solutions can be computed provided the meshes are sufficiently refined.

In two dimensions, the dynamics of the problem are much different because
the evolution is governed by "motion by mean curvature", meaning that the
normal velocity of a transition layer is proportional to the sum of the principal
curvatures of the layer. Consequently, the time scale for the evolution increases
only at an algebraic rate, $\kappa/\epsilon$, where $\kappa$ is the mean curvature, as the diffusion
coefficient $\epsilon$ decreases. We solve the bistable problem using initial data consisting
of two "mesas" corresponding to the two wells in the solution shown in Fig. 1.18
using $\epsilon = .00003$ so that the evolution occurs over the same time scale. We show
four snapshots of the solution in Fig. 1.19. The time evolution of the adjoint
weights shows a pattern of growth and decay as for metastable solutions in one
dimension. However, solutions in two dimensions are much less sensitive to
perturbations than solutions in one dimension, and the adjoint weights are much
smaller overall.

Fig. 1.19. Left: Four snapshots of a bistable solution in two dimensions. Right:
The evolution of the absolute adjoint weights for pointwise errors for time
and space along with the weights for a solution in one dimension.
[Panel labels recovered from the figure: t = 18, t = 54.]

1.4.6 General comments on a posteriori analysis

We can abstract the four steps for a posteriori error analysis as

1. Identify functionals that yield the quantities of interest

2. Define appropriate adjoint problems for the quantities of interest

3. Derive a computable residual for each source of error

4. Derive an error representation using suitable adjoint weights for each
residual

We  also  note  that  in  general  we  have  to  account  for  all  sources  of  error  in  the 
analysis.  Typical  sources  include 

•  space  and  time  discretization  (approximation  of  the  solution  space) 

•  use  of  quadrature  to  compute  integrals  in  a  variational  formulation  (ap¬ 
proximation  of  the  differential  operator) 

•  solution  error  in  solving  any  linear  and  nonlinear  systems  of  equations 

•  model  error 

•  data  and  parameter  error 

• operator decomposition

We have not discussed most of these sources in this note. However, it is
important to realize that different sources of error typically accumulate and
propagate at different rates, and so must be accounted for individually in any
analysis.


1.5  A  posteriori  error  estimates  and  adaptive  mesh  refinement 

Computing accurate error estimates raises the tantalizing possibility of optimizing
discretizations. We briefly discuss the use of a posteriori error estimates for
guiding adaptive mesh refinement.

A typical goal of adaptive error control is to generate a mesh with a relatively
small number of elements such that for a given tolerance TOL and data $\psi$,

\[ |\text{error in the quantity of interest}| = |(e, \psi)| \le \mathrm{TOL}. \tag{1.50} \]

We note that (1.50) cannot be verified in practice because the error is unknown,
so we use an error estimate and try to construct a mesh to achieve

\[ |\text{a posteriori estimate of the error in the quantity of interest}| \le \mathrm{TOL}. \tag{1.51} \]

The general idea is to write the estimate as a sum of "element contributions" that
indicate the contribution from discretization on each element to the total error.
We identify the elements that contribute most and then refine those elements.

However, this simple description belies a number of theoretical and practical
difficulties.


1.5.1 Adaptive mesh refinement in space

We first consider adaptive mesh refinement for a stationary problem. In the case
of an elliptic problem, we use the estimate (1.32) to implement (1.51). To do so,
we rewrite (1.32) as a sum of (signed) element contributions,

\[
(e, \psi) \approx \sum_{K \in \mathcal{T}_h} \int_K \big( (f - b\cdot\nabla U - cU)(\Phi - \pi_h\Phi) - a\nabla U\cdot\nabla(\Phi - \pi_h\Phi) \big)\, dx.
\tag{1.52}
\]

Thus, using (1.52), the goal (1.51) becomes the following condition. The
mesh acceptance criterion is

\[
\Big| \sum_{K \in \mathcal{T}_h} \int_K \big( (f - b\cdot\nabla U - cU)(\Phi - \pi_h\Phi) - a\nabla U\cdot\nabla(\Phi - \pi_h\Phi) \big)\, dx \Big| \le \mathrm{TOL}.
\tag{1.53}
\]

If the current approximation satisfies (1.53), then the solution is deemed acceptable
and the refinement process is stopped.

The difficulties start when (1.53) is not satisfied. We have to decide how to
"enrich" the discretization, e.g., refine the mesh or increase the order of the element
functions, in order to improve the accuracy. The problem is that generally
there is a great deal of cancellation among the contributions from each element.
For example, consider that large positive contributions from one subregion might
cancel the large negative contributions from another region so that the sum of
the contributions from the two regions together is small, see Ex. 1.54 below. In
fact, we make the certainly controversial claim that


There is currently no theory or practical method for accommodating cancellation
of errors in adaptive error control in a way that truly optimizes
efficiency.


The standard approach is to formulate the discretization enrichment problem
as a constrained optimization problem after replacing the error estimate by an
error bound consisting of a sum over elements of positive quantities. For example,
we obtain a bound from (1.52) by inserting norms in some way, e.g., we use

\[
|(e, \psi)| \le \sum_{K \in \mathcal{T}_h} \int_K \big| (f - b\cdot\nabla U - cU)(\Phi - \pi_h\Phi) - a\nabla U\cdot\nabla(\Phi - \pi_h\Phi) \big|\, dx.
\tag{1.54}
\]

Thus, if (1.53) is not satisfied, then the mesh is refined in order to achieve the
more conservative criterion,

\[
\sum_{K \in \mathcal{T}_h} \int_K \big| (f - b\cdot\nabla U - cU)(\Phi - \pi_h\Phi) - a\nabla U\cdot\nabla(\Phi - \pi_h\Phi) \big|\, dx \le \mathrm{TOL}.
\tag{1.55}
\]

The adaptive error control problem is the constrained minimization problem
of finding a mesh with a minimal number of degrees of freedom on which the
approximation satisfies (1.55). Using the fact that the bound in (1.55) is a sum of
positive terms, and assuming the solution is asymptotically accurate, a calculus
of variations argument yields the Principle of Equidistribution, which states
that the solution of this constrained optimization problem is achieved when the
element contributions are all approximately equal. An adaptive mesh algorithm
is a procedure for solving the constrained minimization problem associated with
(1.55). If the Principle of Equidistribution is used, then the algorithm seeks to
choose meshes so that the element contributions are approximately equal.

Depending on the argument, two possible element acceptance criteria for
the element indicators are

\[
\max_K \big| (f - b\cdot\nabla U - cU)(\Phi - \pi_h\Phi) - a\nabla U\cdot\nabla(\Phi - \pi_h\Phi) \big| \lesssim \frac{\mathrm{TOL}}{|\Omega|},
\tag{1.56}
\]

or

\[
\int_K \big| (f - b\cdot\nabla U - cU)(\Phi - \pi_h\Phi) - a\nabla U\cdot\nabla(\Phi - \pi_h\Phi) \big|\, dx \lesssim \frac{\mathrm{TOL}}{M},
\tag{1.57}
\]

where $M$ is the number of elements in $\mathcal{T}_h$. Elements that fail one of these tests
are marked for refinement.

Computing a mesh using these criteria is usually performed by a
"compute-estimate-mark-refine" adaptive algorithm that begins with a coarse mesh and
then successively refines those elements on which (1.56), respectively (1.57), fails.
See (Eriksson et al., 1995; Eriksson et al., 1996; Becker and Rannacher, 2001;
Bangerth and Rannacher, 2003; Estep et al., 2005; Carey et al., 2008b) for more
details.
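
The "estimate" and "mark" stages of such a loop can be sketched in a few lines. This is an illustrative sketch only: it assumes the signed element contributions have already been computed by some solver, the function names are hypothetical, and the sample numbers are made up to show how cancellation separates the signed test (1.53) from the bound-based test (1.55).

```python
def mesh_accepted(contributions, tol):
    """Mesh acceptance test (1.53): absolute value of the signed sum."""
    return abs(sum(contributions)) <= tol

def bound_accepted(contributions, tol):
    """More conservative test (1.55): sum of absolute element contributions."""
    return sum(abs(c) for c in contributions) <= tol

def mark_elements(contributions, tol):
    """Element test in the spirit of (1.57): flag |contribution| > TOL / M."""
    m = len(contributions)
    return [i for i, c in enumerate(contributions) if abs(c) > tol / m]

# Made-up signed contributions on a five-element mesh (illustration only).
contributions = [0.4, -0.38, 0.01, 0.015, -0.005]
tol = 0.1
print(mesh_accepted(contributions, tol))   # signed sum is 0.04, so accepted
print(bound_accepted(contributions, tol))  # absolute sum is 0.81, so rejected
print(mark_elements(contributions, tol))   # elements exceeding TOL / M = 0.02
```

In this made-up case the signed estimate already meets the tolerance, yet the bound-based criterion would still refine the two large, mutually canceling elements.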

The problem with any claims of "optimal" mesh selection is that generically
the bound (1.54) is orders of magnitude larger than the estimate (1.52).

Example 1.54 We illustrate the issue of the effect of cancellation of errors on
the choice of an optimal adapted mesh with a simple computation. Assume that
we solve an elliptic problem on a square domain using bilinear elements on a mesh
consisting of rectangles and the a posteriori estimate yields the signed element
contributions shown on the left in Fig. 1.20. The total a posteriori estimate is

\[ .1 + 3 \times (-.0333) + 17 \times .001 + 4 \times .01 = .0571. \]

Note that the large element contribution in the lower left corner is nearly canceled
by the contributions of its three neighbors, $.1 + 3 \times (-.0333) = .0001$, so that region
ends up contributing relatively little to the error. If we refine in the upper right
hand corner by subdividing each square into four smaller elements as indicated
(and assume the element contributions decrease by a factor of $2^2 = 4$ without
any change in sign), the new estimate becomes

\[ .1 + 3 \times (-.0333) + 17 \times .001 + 16 \times .01 \times \tfrac{1}{4} \times \tfrac{1}{4} = .0271. \]

Note that while we change 4 elements into 16 smaller elements, the element
contribution in each goes down by a factor of 4 while the area of each smaller
element is 4 times smaller than its parent.

On the other hand, if we use the absolute element contributions to guide
refinement, then the elements in the lower left hand corner are refined as shown
on the right in Fig. 1.20, and the new estimate becomes

\[ 4 \times .1 \times \tfrac{1}{4} \times \tfrac{1}{4} + 3 \times \big( 4 \times (-.0333) \times \tfrac{1}{4} \times \tfrac{1}{4} \big) + 17 \times .001 + 4 \times .01 = .057025. \]

There is almost no improvement in overall accuracy in the quantity of interest.

Fig. 1.20. On the left, we display (simulated) signed element contributions. We
shade four elements chosen for refinement using a nonstandard criterion. On
the right, we display the corresponding absolute element contributions and
shade four elements marked for refinement by a standard adaptive algorithm.
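
The arithmetic of Example 1.54 is easy to reproduce. The sketch below takes the 25 element contributions as stated in the example (one of .1, three of -.0333, seventeen of .001, four of .01) and applies the stated refinement rule of dividing a refined element's contribution by 16 per child. In the lower-left scenario the four upper-right elements are left unrefined, so their term is $4 \times .01$, which is the reading that reproduces the quoted total .057025.

```python
# Signed element contributions from Example 1.54.
big, neighbor, small, corner = 0.1, -0.0333, 0.001, 0.01
child = (1.0 / 4.0) * (1.0 / 4.0)  # factor per child: contribution / 4, area / 4

# Original mesh: 1 + 3 + 17 + 4 = 25 elements.
original = big + 3 * neighbor + 17 * small + 4 * corner

# Refine the four upper-right elements into 16 children.
refine_upper_right = big + 3 * neighbor + 17 * small + 16 * corner * child

# Refine the four lower-left elements (largest absolute contributions) instead.
refine_lower_left = (4 * big * child + 3 * (4 * neighbor * child)
                     + 17 * small + 4 * corner)

print(round(original, 6), round(refine_upper_right, 6), round(refine_lower_left, 6))
```

Refining where the signed contributions do not cancel roughly halves the estimate, while refining the largest absolute contributions leaves it essentially unchanged.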

Regardless of the issue of dealing with cancellation of errors efficiently, there
is still a crucial difference between adaptive mesh refinement based on
adjoint-weighted residual estimates and traditional "error indicators" that often amount
to using only residuals or "local errors." In the adjoint-based approach, the
element residuals are scaled by an adjoint weight, which reflects how much error
in that element affects the error in the quantity of interest. This has a significant
effect on mesh refinement patterns in general.

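Example 1.55 In (Estep et al., 2005), we apply these ideas to the adaptive
solution of

\[
\begin{cases}
-\nabla \cdot \big( (.05 + \tanh(10(x - 5)^2 + 10(y - 1)^2))\nabla u \big) + [\text{convection term illegible in source}] = [\text{right-hand side illegible in source}], & (x,y) \in \Omega = [0,10] \times [0,2], \\
u = 0, & (x,y) \in \partial\Omega.
\end{cases}
\tag{1.58}
\]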

In  Fig.  1.21  we  show  the  mesh  required  to  obtain  a  numerical  solution  whose 
average  value  is  accurate  to  within  4%.  The  adaptive  pattern  is  obtained  by 
refining  from  a  coarse  uniform  mesh  using  (1.57).  Convection  causes  a  nonuniform 
pattern  of  refinement. 


Fig.  1.21.  The  mesh  used  to  solve  (1.58)  with  an  error  of  4%  in  the  average 

value  requires  24,4000  elements. 

In the first computation, the quantity of interest is given by a function $\psi$ that
is constant over the entire domain. In the next computation, we take the quantity
of interest to be the average value in a square in one corner of the domain. We
now require far fewer elements to achieve the desired accuracy. The pattern of
refinement shows the effects of the adjoint solution, see Fig. 1.22 and Fig. 1.25.
In particular, the adjoint solution decreases rapidly to zero towards the side of
the domain opposite to the quantity of interest region and there is less dense
mesh refinement along that side. The influence of regions far "upstream" is also
diminished.


Fig. 1.22. The quantity of interest is the average value in the shaded square.
The final mesh requires 7300 elements.

In the next computation, we take the quantity of interest to be the average
value in a square in the middle of the domain. Again, the pattern of refinement shows
the effects of the adjoint solution, see Fig. 1.23 and Fig. 1.25. In particular, the
adaptive mesh refinement makes no attempt to resolve the boundary layer at the
"outflow" boundary as the accuracy there has no effect on the accuracy of the
quantity of interest.


Fig. 1.23. The quantity of interest is the average value in the shaded square.
The final mesh requires 7300 elements.

In the final computation, we take the quantity of interest to be the average
value in a square at the far end of the domain. Again, the pattern of refinement
shows the effects of the adjoint solution, see Fig. 1.24 and Fig. 1.25.


Fig.  1.24.  The  quantity  of  interest  is  the  average  value  in  the  shaded  square. 
The  final  mesh  requires  3500  elements. 

In  Fig.  1.25,  we  plot  the  solutions  of  the  adjoint  problems  corresponding  to 
the  quantities  of  interest  equal  to  average  values  in  squares  at  the  opposite  ends 
of  the  domain. 


Fig. 1.25. The adjoint solutions for the computations in Fig. 1.22 and Fig. 1.23.

1.5.2 Adaptive mesh refinement for evolutionary problems

Traditionally, different approaches for adaptive mesh algorithms are used to handle
spatial meshes and time discretization. Influenced by the long history of "local
error control", the traditional time algorithm achieves an equidistribution of element
contributions by ensuring that the contribution from each time interval is
smaller than, but approximately equal to, a "local error tolerance" LTOL before
proceeding to the next time step. Often, LTOL is input directly without any attempt
to relate it to the desired tolerance TOL on the error. Given a true global
error estimate, however, and the asymptotic accuracy of the integration scheme,
there are various heuristic arguments for determining LTOL in terms of TOL.
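
Traditional local error control of this kind can be sketched for a scalar autonomous ODE. The sketch below is an illustration under stated assumptions, not the algorithm of any particular code: it advances with forward Euler, uses the difference between a Heun step and the Euler step as the local error indicator, and applies the usual square-root step size controller with a safety factor of 0.9. The function name and all parameter values are hypothetical.

```python
import math

def adaptive_euler(f, u0, T, ltol, k0=0.1):
    """Forward Euler with step-by-step local error control against LTOL."""
    t, u, k = 0.0, u0, k0
    accepted = []  # (time reached, step size, local error indicator)
    while T - t > 1e-12:
        k = min(k, T - t)                 # do not step past the final time
        slope = f(u)
        euler = u + k * slope
        heun = u + 0.5 * k * (slope + f(euler))
        indicator = abs(heun - euler)     # local error indicator for Euler
        if indicator <= ltol:             # accept the step
            t += k
            u = euler
            accepted.append((t, k, indicator))
        # Indicator scales like k^2, so rescale k to target LTOL (safety 0.9).
        k = 0.9 * k * math.sqrt(ltol / max(indicator, 1e-16))
    return u, accepted

u_final, steps = adaptive_euler(lambda v: -v, 1.0, 1.0, ltol=1e-4)
print(len(steps), u_final, math.exp(-1.0))
```

Every accepted step has an indicator just below LTOL, yet nothing in the loop relates LTOL to the global error at the final time. That missing link is what an adjoint-weighted global estimate supplies.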

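For an evolutionary partial differential equation, space and time mesh refinement
strategies have to be combined somehow. In the case of a parabolic
problem, we distinguish time and space contributions in (1.49) by "splitting"
the projections on the adjoint solution,

\[
\begin{aligned}
\mathbb{E}(U) = \Big| & ((I - P)u_0, \Phi(0)) + \sum_{j=1}^{N} \big([U]_{j-1}, (P\Phi - \Phi)_{j-1}^+\big) \\
& + \int_0^T \big( (\dot U, P\Phi - \Phi) + (\epsilon\nabla U, \nabla(P\Phi - \Phi)) - (f(U), P\Phi - \Phi) \big)\, dt \\
& + \sum_{j=1}^{N} \big([U]_{j-1}, ((\pi - 1)P\Phi)_{j-1}^+\big) \\
& + \int_0^T \big( (\dot U, (\pi - 1)P\Phi) + (\epsilon\nabla U, \nabla((\pi - 1)P\Phi)) - (f(U), (\pi - 1)P\Phi) \big)\, dt \Big|.
\end{aligned}
\tag{1.59}
\]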
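We define bounds on the time and space contributions,

\[
\begin{aligned}
\mathbb{E}_t(U) = {} & \sum_{j=1}^{N} \sum_{K \in \mathcal{T}_h} \Big| \int_K [U]_{j-1} \big((\pi - 1)P\Phi\big)_{j-1}^+ \, dx \Big| \\
& + \sum_{n=1}^{N} \sum_{K \in \mathcal{T}_h} \Big| \int_{t_{n-1}}^{t_n} \int_K \big( \dot U ((\pi - 1)P\Phi) + (\epsilon\nabla U)\cdot(\nabla(\pi - 1)P\Phi) - f(U)((\pi - 1)P\Phi) \big)\, dx\, dt \Big|,
\end{aligned}
\tag{1.60}
\]

\[
\begin{aligned}
\mathbb{E}_x(U) = {} & \sum_{K \in \mathcal{T}_h} \Big| \int_K (I - P)u_0\, \Phi(0)\, dx \Big| + \sum_{j=1}^{N} \sum_{K \in \mathcal{T}_h} \Big| \int_K [U]_{j-1} (P\Phi - \Phi)_{j-1}^+ \, dx \Big| \\
& + \sum_{n=1}^{N} \sum_{K \in \mathcal{T}_h} \Big| \int_{t_{n-1}}^{t_n} \int_K \big( \dot U (P\Phi - \Phi) + (\epsilon\nabla U)\cdot\nabla(P\Phi - \Phi) - f(U)(P\Phi - \Phi) \big)\, dx\, dt \Big|.
\end{aligned}
\tag{1.61}
\]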

Note that the space discretization may affect the time contribution and likewise
the time discretization may affect the space contribution.

We may now split the adaptive mesh problem into two sub-problems, refining
the space and time steps in order to achieve

\[ \mathbb{E}_x(U) \le \frac{\mathrm{TOL}}{2} \quad \text{and} \quad \mathbb{E}_t(U) \le \frac{\mathrm{TOL}}{2}. \tag{1.62} \]

On a given time interval, this requires an iteration during which both the space
mesh and time steps are refined.


1.6  Multiscale  operator  decomposition 

We now turn to the main goal of this chapter, which is to describe how the
techniques of a posteriori error analysis can be extended to multiscale operator
decomposition solutions of multiphysics, multiscale problems. Recall that the general
approach is to decompose the multiphysics problem into components involving
simpler physics over a relatively limited range of scales, and then to seek the
solution of the entire system through some sort of iterative procedure involving
solutions of the individual components.

While the particulars of the analysis vary considerably with the problem,
there are several key ideas underlying a general approach to treat operator
decomposition multiscale methods, including:

• We identify auxiliary quantities of interest associated with information
passed between physical components and solve auxiliary adjoint problems
to estimate the error in those quantities.

• We deal with scale differences by introducing projections between discrete
spaces used for component solutions and estimate the effects of those
projections.

• The standard linearization argument used to define an adjoint operator
associated with error analysis for a nonlinear problem may fail, requiring
another approach to define adjoint operators.

• In this regard, the adjoint operator associated with a multiscale operator
decomposition solution method is often different from the adjoint associated
with the original problem, and the difference may have a significant
impact on the stability of the method.

• In practice, solving the adjoint associated with the original fully-coupled
problem may present the same kinds of multiphysics, multiscale challenges
posed by the original problem, so attention must be paid to the solution
of the adjoint problem.

We  explain  these  ideas  in  the  context  of  three  examples. 

1.6.1  Multiscale decomposition of triangular systems of elliptic problems

Following (Carey et al., 2006), we can capture the essential features of the
thermal actuator model described in Example 1.1 using a two component “one-way”
coupled system of the form

$$
\begin{cases}
-\nabla\cdot a_1\nabla u_1 + b_1\cdot\nabla u_1 + c_1 u_1 = f_1(x), & x\in\Omega,\\
-\nabla\cdot a_2\nabla u_2 + b_2\cdot\nabla u_2 + c_2 u_2 = f_2(x,u_1,Du_1), & x\in\Omega,\\
u_1 = u_2 = 0, & x\in\partial\Omega,
\end{cases} \tag{1.63}
$$

where $a_i$, $b_i$, $c_i$, $f_i$ are smooth functions, with $a_1, a_2 \ge \alpha > 0$ on a
bounded domain $\Omega$ with boundary $\partial\Omega$, and $\alpha$ is a constant. Note that
the problems are coupled through $f_2$. The “lower triangular” form of this system
means that we can either solve it as a coupled system or we can solve the first
equation and then use the solution to generate the parameters for the second
problem. The latter approach fits the idea of a multiscale, operator
decomposition discretization.

The weak form of the first component of (1.63) reads: find $u_1 \in H_0^1(\Omega)$
satisfying

$$
A_1(u_1,v_1) = (f_1,v_1), \quad\text{for all } v_1\in H_0^1(\Omega), \tag{1.64}
$$

where

$$
A_1(u_1,v_1) \equiv (a_1\nabla u_1,\nabla v_1) + (b_1(x)\cdot\nabla u_1,v_1) + (c_1u_1,v_1)
$$

is a bilinear form on $\Omega$ and $H_0^1(\Omega)$ is the subspace of functions in
$H^1(\Omega)$ that are zero on $\partial\Omega$. Likewise the weak formulation of the second
component of (1.63) reads: find $u_2 \in H_0^1(\Omega)$ satisfying

$$
A_2(u_2,v_2) = (f_2(x,u_1,Du_1),v_2), \quad\text{for all } v_2\in H_0^1(\Omega), \tag{1.65}
$$

where

$$
A_2(u_2,v_2) \equiv (a_2\nabla u_2,\nabla v_2) + (b_2(x)\cdot\nabla u_2,v_2) + (c_2u_2,v_2)
$$

is another bilinear form on $\Omega$.

We introduce the finite element space $S_{h,1}(\Omega) \subset H_0^1(\Omega)$, corresponding
to a discretization $\mathcal{T}_{h,1}$ of $\Omega$ for the first component, and another
finite element space $S_{h,2}(\Omega)$ on a different mesh $\mathcal{T}_{h,2}$, for the second
component. Using different finite element spaces for different components in a
system of equations raises a serious practical difficulty. Namely, evaluating
the integrals that define the finite element approximations is problematic when
the integrands involve functions from different spaces. In practice, quadrature
formulas are used to approximate the integrals defining a finite element
function. This raises a potential difficulty because quadrature formulas work
best when the integrands are smooth, whereas the standard finite element
functions are only continuous. We avoid potential difficulties by writing
any integrals as a sum of integrals over elements,

$$
\int_\Omega \text{integrand}\; dx = \sum_{K\in\mathcal{T}_h} \int_K \text{integrand}\; dx,
$$

and applying quadrature formulas on each element, on which the finite element
functions are smooth. However, in the case of a system in which the components
are solved in different finite element spaces, it is not so straightforward to
apply quadrature formulas to evaluate integrals. A function in one finite
element space may only be continuous on an element associated with another
finite element space. To avoid this problem, we introduce projections
$\Pi_{i\to j}$ from $S_{h,i}$ to $S_{h,j}$, e.g. interpolants or an orthogonal
projection. We apply these projections before applying quadrature formulas.


Algorithm 2  Multiscale Operator Decomposition for Triangular Systems of
Elliptic Equations

Construct discretizations $\mathcal{T}_{h,1}$, $\mathcal{T}_{h,2}$ and corresponding spaces $S_{h,1}$, $S_{h,2}$
Compute $U_1 \in S_{h,1}(\Omega)$ satisfying

$$
A_1(U_1,v_1) = (f_1,v_1), \quad\text{for all } v_1\in S_{h,1}(\Omega). \tag{1.66}
$$

Compute $U_2 \in S_{h,2}(\Omega)$ satisfying

$$
A_2(U_2,v_2) = (f_2(x,\Pi_{1\to2}U_1,\Pi_{1\to2}DU_1),v_2), \quad\text{for all } v_2\in S_{h,2}(\Omega). \tag{1.67}
$$


We  observe  that  any  errors  made  in  the  solution  of  the  first  component  affect 
the  solution  of  the  second  component.  This  turns  out  to  be  a  crucial  fact  for  a 
posteriori  error  analysis. 
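
As a rough illustration of this observation, the following sketch applies the two-step procedure of Algorithm 2 to a one-dimensional analog. Everything specific here is an assumption made for illustration and is not taken from (Carey et al., 2006): the model problem $-u_1'' = \pi^2\sin(\pi x)$, $-u_2'' = u_1$ on $(0,1)$ with zero boundary values, the helper names `solve_poisson` and `split_solve`, piecewise linear elements with a lumped (trapezoidal) load, and nodal interpolation via `np.interp` standing in for the projection from the first space to the second. The quantity of interest is $u_2(0.5) = 1/\pi^2$.

```python
import numpy as np


def solve_poisson(num_elements, source):
    """Piecewise linear elements for -u'' = source on (0, 1), u(0) = u(1) = 0.

    The load vector uses lumped (trapezoidal) quadrature, so the scheme
    coincides with the standard three-point finite difference method.
    """
    x = np.linspace(0.0, 1.0, num_elements + 1)
    h = 1.0 / num_elements
    m = num_elements - 1  # number of interior nodes
    K = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h
    U = np.zeros_like(x)
    U[1:-1] = np.linalg.solve(K, h * source(x[1:-1]))
    return x, U


def f1(x):
    """Data for the first component (hypothetical test problem)."""
    return np.pi**2 * np.sin(np.pi * x)


exact_qoi = 1.0 / np.pi**2  # u2(0.5) for u1 = sin(pi x), u2 = sin(pi x) / pi^2


def split_solve(n1, n2):
    """Solve component 1 on n1 elements, then component 2 on n2 elements."""
    x1, U1 = solve_poisson(n1, f1)
    # Transfer U1 to the second mesh by nodal interpolation (stand-in projection).
    x2, U2 = solve_poisson(n2, lambda x: np.interp(x, x1, U1))
    return float(np.interp(0.5, x2, U2))


for n2 in (8, 32, 128, 512):
    coarse_first = abs(split_solve(4, n2) - exact_qoi)
    both_refined = abs(split_solve(n2, n2) - exact_qoi)
    print(f"n2 = {n2:4d}   first mesh fixed: {coarse_first:.3e}   "
          f"both refined: {both_refined:.3e}")
```

With the first mesh held at four elements, refining the second mesh stops paying off once the error inherited from $U_1$ dominates, whereas refining both meshes keeps reducing the error in the quantity of interest.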

Example 1.56  In (Carey et al., 2006), we solve a system

$$
\begin{cases}
-\Delta u_1 = \sin(4\pi x)\sin(\pi y), & x\in\Omega,\\
-\Delta u_2 = b\cdot\nabla u_1, & x\in\Omega,\\
u_1 = u_2 = 0, & x\in\partial\Omega,
\end{cases} \tag{1.68}
$$

for a given vector field $b$ specified in (Carey et al., 2006), using a standard
piecewise linear, continuous finite element method, where $\Omega = [0,1]\times[0,1]$,
in order to compute the quantity of interest

$$
u_2(.25,.25).
$$

We solve for $u_1$ first and then solve for $u_2$ using independent meshes and show
the solutions in Fig. 1.26.

Fig. 1.26. Solutions of the component problems of (1.68) computed on uniform
meshes.

Using uniform meshes, evaluating the standard a posteriori error estimate
for the second component problem, ignoring any effect arising from error in the
solution of the first component, yields an estimate of the component solution
error of ≈ .0042. However, the true error is ≈ .0048, so there is a discrepancy
of ≈ .0006 (≈ 13%) in the estimate. This is a consequence of ignoring the
transfer error.

If we adapt the mesh for the solution of the second component based on the
a posteriori error estimate of the error in that component while neglecting the
effects of the decomposition, see Fig. 1.27, the discrepancy becomes alarmingly
worse. For example, we can refine the mesh until the estimate of the error in the
second component is ≈ .0001. But we find that the true error is ≈ .2244!


Fig. 1.27. Results obtained after refining the mesh for the second component
so that the a posteriori error estimate of the error only in the second
component is less than .001. The mesh for the first component remains coarse,
consequently the error in the first component becomes relatively much larger.


1.6.1.1  A linear algebra example  We can describe some of the essential features
of the analysis using a block lower triangular linear system of equations. We
consider the system

$$
\begin{pmatrix} A_{11} & 0 \\ A_{21} & A_{22} \end{pmatrix}
\begin{pmatrix} u_1 \\ u_2 \end{pmatrix}
=
\begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \tag{1.69}
$$

with approximate solution $U = (U_1, U_2)^\top$.

We estimate the error in a primary quantity of interest involving only $u_2$,

$$
(\psi^{(1)},u) = (\psi_2^{(1)},u_2) \quad\text{where}\quad \psi^{(1)} = \begin{pmatrix} 0 \\ \psi_2^{(1)} \end{pmatrix}.
$$

We require a superscript (1) since we later define an auxiliary quantity of
interest. The lower triangular structure of the system matrix yields residuals

$$
R_1 = b_1 - A_{11}U_1, \qquad R_2 = (b_2 - A_{21}U_1) - A_{22}U_2.
$$


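Note that the residual $R_2$ of the approximate solution of the second component
depends upon the solution of the first component, and any attempt to decrease
this residual requires a consideration of the accuracy of $U_1$. The adjoint
problem to (1.69) is

$$
\begin{pmatrix} A_{11}^\top & A_{21}^\top \\ 0 & A_{22}^\top \end{pmatrix}
\begin{pmatrix} \phi_1^{(1)} \\ \phi_2^{(1)} \end{pmatrix}
=
\begin{pmatrix} 0 \\ \psi_2^{(1)} \end{pmatrix},
$$

and, with $e_i = u_i - U_i$, the resulting error representation is

$$
\begin{aligned}
(\psi^{(1)},e) = (\psi_2^{(1)},e_2) &= (A_{22}^\top\phi_2^{(1)},e_2) \\
&= (\phi_2^{(1)},A_{22}u_2) - (\phi_2^{(1)},A_{22}U_2) \\
&= (\phi_2^{(1)},b_2 - A_{21}u_1) - (\phi_2^{(1)},A_{22}U_2) \\
&= (\phi_2^{(1)},b_2 - A_{21}U_1 - A_{22}U_2) - (\phi_2^{(1)},A_{21}e_1) \\
&= (\phi_2^{(1)},R_2) - (\phi_2^{(1)},A_{21}e_1).
\end{aligned} \tag{1.70}
$$

The first term of the error representation requires only $U_2$ and $\phi_2^{(1)}$.
Since the adjoint system is upper triangular and

$$
\phi_2^{(1)} = (A_{22}^\top)^{-1}\psi_2^{(1)}
$$

is independent of the first component, the calculation of $(\phi_2^{(1)},R_2)$
remains within the “single physics paradigm”, that is, we solve the adjoint
problem using individual component solves rather than forming and solving a
global problem. The second term, involving $(\phi_2^{(1)},A_{21}e_1)$, represents
the effect of errors in $U_1$ on the solution $U_2$. At first glance this term is
uncomputable, but we note that it is a linear functional of $e_1$ since

$$
(\phi_2^{(1)},A_{21}e_1) = (A_{21}^\top\phi_2^{(1)},e_1).
$$

We therefore form the adjoint problem for the transfer error,

$$
\begin{pmatrix} A_{11}^\top & A_{21}^\top \\ 0 & A_{22}^\top \end{pmatrix}
\begin{pmatrix} \phi_1^{(2)} \\ \phi_2^{(2)} \end{pmatrix}
=
\begin{pmatrix} \psi_1^{(2)} \\ 0 \end{pmatrix},
\qquad \psi_1^{(2)} = A_{21}^\top\phi_2^{(1)}.
$$

The upper triangular block structure of the adjoint matrix immediately yields
$\phi_2^{(2)} = 0$. As noted earlier, error estimates of $U_1$ should be
independent of $u_2$. Thus, $A_{11}^\top\phi_1^{(2)} = \psi_1^{(2)} = A_{21}^\top\phi_2^{(1)}$,
so that once again we can solve for $\phi_1^{(2)}$ in the “single physics
paradigm.” Given $\phi_1^{(2)}$, we obtain the auxiliary error representation

$$
(\psi^{(2)},e) = (A_{21}^\top\phi_2^{(1)},e_1) = (\phi_1^{(2)},R_1). \tag{1.71}
$$

Combining the first term of (1.70) with (1.71) yields the complete error
representation

$$
(\psi^{(1)},e) = (\phi_2^{(1)},R_2) - (\phi_1^{(2)},R_1),
$$

which is a signed sum of the inner products of “single physics” residuals and
adjoint solutions computed using the “single physics” paradigm.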
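
The bookkeeping above can be checked numerically. The sketch below is only an illustration under stated assumptions: the block size, the $1/\sqrt{n}$ scaling of the random blocks (chosen here so the diagonal blocks are comfortably invertible), the way inexact solves are mimicked by perturbing exact ones, and all variable names are hypothetical. It uses the residuals $R_1 = b_1 - A_{11}U_1$ and $R_2 = (b_2 - A_{21}U_1) - A_{22}U_2$, errors $e_i = u_i - U_i$, and transfer adjoint data $A_{21}^\top\phi_2$; with these conventions the transfer contribution enters with a minus sign, and other conventions absorb the sign into the adjoint data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50  # block size (hypothetical)


def rand(shape):
    """Array with entries drawn from U(-1, 1)."""
    return rng.uniform(-1.0, 1.0, shape)


# Blocks of the lower triangular system (scaling is an assumption of this sketch).
A11 = np.eye(n) + 0.2 * rand((n, n)) / np.sqrt(n)
A21 = rand((n, n)) / np.sqrt(n)
A22 = np.eye(n) + 0.1 * rand((n, n)) / np.sqrt(n)
b1, b2 = rand(n), rand(n)

# Exact component solutions.
u1 = np.linalg.solve(A11, b1)
u2 = np.linalg.solve(A22, b2 - A21 @ u1)

# Inexact component solves, mimicked by perturbing exact solves.
U1 = u1 + 1e-3 * rand(n)
U2 = np.linalg.solve(A22, b2 - A21 @ U1) + 1e-6 * rand(n)

# "Single physics" residuals.
R1 = b1 - A11 @ U1
R2 = (b2 - A21 @ U1) - A22 @ U2

# Quantity of interest: average of the coefficients of u2.
psi2 = np.full(n, 1.0 / n)

# Primary adjoint (second block only), then transfer adjoint (first block only).
phi2 = np.linalg.solve(A22.T, psi2)
phi1 = np.linalg.solve(A11.T, A21.T @ phi2)

true_error = psi2 @ (u2 - U2)
own_term = phi2 @ R2        # contribution of the second component solve
transfer_term = phi1 @ R1   # contribution inherited from the first component
estimate = own_term - transfer_term

print(f"true error in QoI : {true_error: .6e}")
print(f"(phi2, R2)        : {own_term: .6e}")
print(f"-(phi1, R1)       : {-transfer_term: .6e}")
print(f"sum of the two    : {estimate: .6e}")
```

Only component-sized solves with $A_{11}^\top$ and $A_{22}^\top$ appear; no global adjoint system is formed.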

Example 1.57  We consider a 101 × 101 system with

$$
A_{11} = I + .2 \times \text{random matrix}, \qquad
A_{21} = \text{random matrix}, \qquad
A_{22} = I + .1 \times \text{random matrix},
$$

where the coefficients in the random matrices are $U(-1,1)$. The right-hand side
is also a random vector with $U(-1,1)$ coefficients. We solve the linear systems
using the Gauss-Seidel iteration with varying numbers of iterations, so that we
have control over the accuracy of the solutions. The quantity of interest is the
average of the coefficients of $u_2$.

In the first computation, we solve the first component $A_{11}U_1 = b_1$ using
20 iterations. The error in the resulting solution is ≈ .031. When we solve the
second component $A_{22}U_2 = b_2 - A_{21}U_1$ using 40 or more iterations, we find an
error in the quantity of interest of $\approx 3.9\times10^{-4}$. We cannot improve the accuracy
of the second component regardless of how many iterations we use beyond 40.
Solving the adjoint problems, we find

«3.86x  10"^ 

~3.1  X  10“®, 

r;3.86x  10““. 

The error in the quantity of interest is almost entirely due to the error in the
solution of the first component.

In the second computation, we solve the first component $A_{11}U_1 = b_1$ using
200  iterations.  The  error  in  the  resulting  solution  is  r;  10”®.  When  we  solve  the 
second component $A_{22}U_2 = b_2 - A_{21}U_1$ using 40 or more iterations, we find an
error  in  the  quantity  of  interest  of  ss  10”^.  This  confirms  that  the  error  in  the 
solution  of  the  second  component  in  the  first  computation  is  dominated  by  the 
error  in  the  solution  of  the  first  component. 

1.6.1.2  Description of the a posteriori analysis  We seek the error $e_2$ in a
quantity of interest given by a functional of $u_2$, noting that a quantity of
interest involving $u_1$ can be computed without solving for $u_2$. Since we
introduce some additional, auxiliary quantities of interest, we denote the
primary quantity of interest by $(\psi_2^{(1)},e_2)$. We use the adjoint operators

$$
A_1^*(\phi_1,v_1) = (a_1\nabla\phi_1,\nabla v_1) - (\mathrm{div}(b_1\phi_1),v_1) + (c_1\phi_1,v_1),
$$

$$
A_2^*(\phi_2,v_2) = (a_2\nabla\phi_2,\nabla v_2) - (\mathrm{div}(b_2\phi_2),v_2) + (c_2\phi_2,v_2).
$$


We also use the linearization

$$
Lf_2(u_1)(u_1 - U_1) = \int_0^1 \frac{\partial f_2}{\partial u_1}\bigl(u_1 s + U_1(1-s)\bigr)\, ds.
$$

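Noting that the solution of the first adjoint component is not needed to
compute the quantity of interest $(\psi^{(1)},e) = (\psi_2^{(1)},e_2)$, we define the
primary adjoint problem to be

$$
A_2^*(\phi_2^{(1)},v_2) = (\psi_2^{(1)},v_2) \quad\text{for all } v_2\in W_2^1(\Omega).
$$

The standard argument yields the error representation,

$$
(\psi^{(1)},e) = (\psi_2^{(1)},e_2) = (f_2(x,u_1,Du_1),\phi_2^{(1)}) - A_2(U_2,\phi_2^{(1)}). \tag{1.72}
$$

To simplify notation, we denote the weak residual of each solution component
by

$$
\mathcal{R}_i(U_i,\chi;\nu) = (f_i(\nu),\chi) - A_i(U_i,\chi), \quad i = 1,2,
$$

so (1.72) becomes

$$
(\psi^{(1)},e) = \mathcal{R}_2(U_2,\phi_2^{(1)};u_1).
$$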

At this point, it is not clear that (1.72) is computationally useful since the
residual on the right-hand side of (1.72) involves the unknown true solution $u_1$.
One consequence is that we cannot immediately use Galerkin orthogonality by
inserting a projection of $\phi_2^{(1)}$ into the representation, since Galerkin
orthogonality for $U_2$ holds for the residual $\mathcal{R}_2(U_2,\phi_2^{(1)};U_1)$, not
$\mathcal{R}_2(U_2,\phi_2^{(1)};u_1)$.

To deal with this, we add and subtract $(f_2(x,U_1,DU_1),\phi_2^{(1)})$ in (1.72) and,
assuming the same meshes are used for $U_1$ and $U_2$, use Galerkin orthogonality
to obtain

$$
(\psi^{(1)},e) = \mathcal{R}_2(U_2,(I-\Pi_2)\phi_2^{(1)};U_1) + (f_2(x,u_1,Du_1) - f_2(x,U_1,DU_1),\phi_2^{(1)}), \tag{1.73}
$$

where $\Pi_2$ is a projection into the finite element space for $U_2$. The first term
on the right of (1.73) is the standard a posteriori error expression for the
second component while the remaining difference represents the transfer error
that arises from using an approximation of $u_1$ in defining the coefficients in
the equation for $U_2$. The goal now is to estimate this transfer error and its
effect on the quantity of interest.

We recognize that the transfer error is a (nominally nonlinear) functional of
the error in $u_1$, defining an auxiliary quantity of interest. We approximate it
by a linear functional,

$$
(f_2(x,u_1,Du_1) - f_2(x,U_1,DU_1),\phi_2^{(1)}) \approx (Df_2(U_1)\times e_1,\phi_2^{(1)}) = (e_1,\psi_1^{(2)}).
$$

We define the corresponding transfer error adjoint problem

$$
A_1^*(\phi_1^{(2)},v_1) = (\psi_1^{(2)},v_1) \quad\text{for all } v_1\in W_2^1(\Omega), \tag{1.74}
$$

noting that as for the primary problem, we do not have to solve the second
component of the full adjoint problem. The transfer error representation formula
is given by

$$
(\psi_1^{(2)},e_1) = (f_1,(I-\Pi_1)\phi_1^{(2)}) - A_1(U_1,(I-\Pi_1)\phi_1^{(2)}),
$$

where $\Pi_1$ is a projection into the finite element space of $U_1$. We obtain the
error representation,

$$
(\psi^{(1)},e) = \mathcal{R}_2(U_2,(I-\Pi_2)\phi_2^{(1)};U_1) + \mathcal{R}_1(U_1,(I-\Pi_1)\phi_1^{(2)}). \tag{1.75}
$$

In the final step, we account for the error induced by using a multiscale
discretization, i.e. different meshes for $U_1$ and $U_2$. Example 1.56 shows that
this can have a significant effect on overall accuracy.

One issue is that we use

$$
f_2(x,\Pi_{1\to2}U_1,\Pi_{1\to2}DU_1)
$$

in the equations defining the finite element approximation. Correspondingly, we
alter the definition of the residual

$$
\mathcal{R}_2(U_2,\chi;\nu) = (f_2(\Pi_{1\to2}\nu),\chi) - A_2(U_2,\chi).
$$

In addition, we use projections to treat any integral involving functions that
are defined on both discretizations, i.e. functions of $U_i$ and $\phi_i$, $i = 1,2$.
After decomposing the original estimate to account for all the projections, the
new error representation formula for the transfer error becomes

$$
(Df_2(U_1)\times e_1,\Pi_{2\to1}\phi_2^{(1)}) + (Df_2(U_1)\times e_1,(I-\Pi_{2\to1})\phi_2^{(1)}),
$$

which is the error contribution arising from the transfer as well as an additional
term that is large when the approximation spaces are significantly different.

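The data defining the transfer error adjoint is now

$$
(f_2(u_1) - f_2(U_1),\phi_2^{(1)}) \approx (Df_2(U_1)\times e_1,\Pi_{2\to1}\phi_2^{(1)}) = (\psi_1^{(2)},e_1).
$$

The additional term $(Df_2(U_1)\times e_1,(I-\Pi_{2\to1})\phi_2^{(1)})$ is a linear
functional, so we define an additional auxiliary quantity of interest

$$
(\psi_1^{(3)},e_1) = (Df_2(U_1)\times e_1,(I-\Pi_{2\to1})\phi_2^{(1)})
$$

and the corresponding adjoint problem

$$
A_1^*(\phi_1^{(3)},v_1) = (\psi_1^{(3)},v_1) \quad\text{for all } v_1\in W_2^1(\Omega). \tag{1.76}
$$

The final error representation is therefore (Carey et al., 2006)

Theorem 1.58

$$
\begin{aligned}
(\psi^{(1)},e) ={}& \mathcal{R}_2(U_2,(I-\Pi_2)\phi_2^{(1)};U_1) + \mathcal{R}_1(U_1,(I-\Pi_1)(\phi_1^{(2)}+\phi_1^{(3)})) \\
&+ (\Pi_{1\to2}f_2(U_1) - f_2(\Pi_{1\to2}U_1),\phi_2^{(1)}) + ((I-\Pi_{1\to2})f_2(U_1),\phi_2^{(1)}).
\end{aligned} \tag{1.77}
$$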

We emphasize that evaluating the integrals in (1.77) is far from trivial. We
have used Monte-Carlo techniques with good results, see (Carey et al., 2006).

Example 1.59  In Example 1.56, we estimate the contributions to the error
reported in that computation using the relevant portions of (1.77). To produce
the adaptive mesh results shown in Fig. 1.27, we construct the adapted mesh
using equidistribution based on a bound derived from the first term in (1.77),
i.e. neglecting the terms that estimate the transfer error.

Instead, we consider the system (1.68) for the quantity of interest equal to
the average value of $u_2$. We begin with the same initial coarse meshes as in
Fig. 1.27, but add the transfer error expression to the mesh refinement criterion.
Adapting the mesh so that the total error in the quantity of interest for $U_2$ has
error estimates less than $10^{-4}$ yields the meshes shown in Fig. 1.28. We see that
the first component solve requires significantly more refinement than the second
component.

Example 1.60  This example shows that differences in mesh discretization scale
between the two components can contribute significantly to the error. We again
solve (1.68) for the quantity of interest equal to $u_2(.15,.15)$. We begin with
identical coarse meshes for the two components, but refine only the mesh for $U_2$.
We solve the primary adjoint problem as well as the two auxiliary adjoint
problems and show the results in Table 1.3.


Table  1.3.  Error  contributions  when  there  is  a  scale  difference  in  the  meshes. 

1.6.2  Multiscale decomposition of reaction-diffusion problems

We follow the presentation in (Estep et al., 2008a). In the introduction, we
presented operator splitting Alg. 1 for reaction-diffusion problems as a classic
example of multiscale operator decomposition. Upon discretizing a
reaction-diffusion equation (1.4.4) in space using a standard piecewise linear,
continuous finite element method as described in Sec. 1.4.1 we obtain a (high
dimensional) initial value problem of the form (1.7). We can then apply the
operator splitting algorithm Alg. 1. Finally, we discretize the component
problems of the operator splitting method using either the dG or cG methods on
the independent time discretizations $\{t_n\}$, $\{s_{m,n}\}$ described in Fig. 1.4.

For example, if we use the dG methods for both components, then the finite
element approximate solutions are sought in piecewise polynomial spaces for
the diffusion and reaction components respectively,

$$
\mathcal{V}^{(q_d)} = \{U : U|_{I_n}\in\mathcal{P}^{(q_d)}(I_n),\ 1\le n\le N\},
$$

$$
\mathcal{V}^{(q_r)}(I_n) = \{U : U|_{I_{m,n}}\in\mathcal{P}^{(q_r)}(I_{m,n}),\ 1\le m\le M_n\},
$$

for $n = 1,\dots,N$, where $I_n = [t_{n-1},t_n]$, $I_{m,n} = [s_{m-1,n},s_{m,n}]$, and
$\mathcal{P}^{(q_d)}(I_n)$ denotes the space of polynomials in $\mathbb{R}^l$ of degree $q_d$ on
$I_n$. A similar definition holds for $\mathcal{P}^{(q_r)}(I_{m,n})$. We let $U_n^{\pm}$ denote
the left- and right-hand limits of $U$ at $t_n$ and $[U]_n = U_n^+ - U_n^-$ the jump
value of $U$ at $t_n$.

Let $\tilde Y(t)$ be the piecewise continuous finite element approximation of the
operator splitting with

The nodal values $\tilde Y_n$ are obtained from the following procedure:

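Algorithm 3  Multiscale Operator Splitting for Reaction-Diffusion Equations

Set $\tilde Y_0 = y_0$
for $n = 1,\dots,N$ do
    Set $\tilde Y^{r,-}_{0,n} = \tilde Y_{n-1}$
    for $m = 1,\dots,M_n$ do
        Compute $\tilde Y^r|_{I_{m,n}}\in\mathcal{P}^{(q_r)}(I_{m,n})$ satisfying

        $$
        \int_{I_{m,n}} (\dot{\tilde Y}^r,W)\,dt + ([\tilde Y^r]_{m-1,n},W^+_{m-1}) = \int_{I_{m,n}} (F(\tilde Y^r),W)\,dt
        \quad\text{for all } W\in\mathcal{P}^{(q_r)}(I_{m,n}) \tag{1.78}
        $$

    end for
    Set $\tilde Y^{d,-}_{n-1} = \tilde Y^{r,-}_{M_n,n}$
    Compute $\tilde Y^d|_{I_n}\in\mathcal{P}^{(q_d)}(I_n)$ satisfying

    $$
    \int_{I_n} (\dot{\tilde Y}^d,V)\,dt + ([\tilde Y^d]_{n-1},V^+_{n-1}) = \int_{I_n} (A\tilde Y^d,V)\,dt
    \quad\text{for all } V\in\mathcal{P}^{(q_d)}(I_n) \tag{1.79}
    $$

    Set $\tilde Y_n = \tilde Y^d(t_n^-)$
end for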


Adapting standard convergence analysis techniques, we can show that if $F$
is Lipschitz continuous, then for $q_d = 0,1$ and $q_r = 0,1$, there exist constants
$C_1$, $C_2$, $C_3$ such that

$$
|y_N - \tilde Y_N| \le C_1\Delta t + C_2\Delta t^{q_d+1} + C_3\Delta s^{q_r+1}.
$$

In Ex. 1.5, we present an example in which multiscale operator decomposition
affects the stability, and hence accuracy, of the solution. Such effects can take a
myriad of forms.
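
The first-order term in the time step in this bound comes from the splitting itself and is present even when each component is solved exactly. The sketch below isolates it under simplifying assumptions that are not part of the source: a linear two-dimensional problem $\dot y = Ay + By$ with symmetric, non-commuting stand-in matrices, exact component solves via an eigendecomposition (the helper `expm_sym` is hypothetical), and reaction-then-diffusion ordering on each step.

```python
import numpy as np


def expm_sym(S, t):
    """exp(t * S) for a symmetric matrix S, via its eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.exp(t * w)) @ V.T


A = np.array([[-2.0, 1.0], [1.0, -2.0]])   # stand-in "diffusion" operator
B = np.array([[-1.0, 0.0], [0.0, -3.0]])   # stand-in linear "reaction"
y0 = np.array([1.0, 0.5])
T = 1.0


def split_solution(num_steps):
    """First-order splitting with exact component solves on each step."""
    dt = T / num_steps
    # Applied to a vector: the reaction solve acts first, then the diffusion solve.
    step = expm_sym(A, dt) @ expm_sym(B, dt)
    y = y0.copy()
    for _ in range(num_steps):
        y = step @ y
    return y


exact = expm_sym(A + B, T) @ y0
errors = [np.linalg.norm(split_solution(k) - exact) for k in (100, 200, 400)]
ratios = [errors[i] / errors[i + 1] for i in range(2)]

for k, err in zip((100, 200, 400), errors):
    print(f"steps = {k:4d}   splitting error = {err:.3e}")
print("error ratios when halving the step:", ratios)
```

Halving the time step roughly halves the error, which is the behavior expected of a first-order splitting term when the component discretization errors are absent.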

Example 1.61  In (Estep et al., 2008a), we illustrate the instability of operator
splitting applied to the Brusselator problem (1.4). We apply a standard first
order splitting scheme to a space discretization of the Brusselator model with
500 discrete points with $\alpha = .6$, $\beta = 2$, $k_1 = k_2 = .025$, consisting of the
cG(1) method for the diffusion with time step of .2 and the dG(0) method for the
reaction with time step of .004. On the left of Fig. 1.29, we show a numerical
solution that exhibits nonphysical oscillations that developed after some time.
On the right, we show plots of the error versus time steps at different times.
There is a critical time step above which the instability develops. Moreover,
changing the space discretization does not improve the accuracy. In fact, using
a finer spatial discretization for a constant time step size leads to
significantly more error in the long time solution, see (Ropp and Shadid, 2005).

Fig. 1.29. The left-hand plot illustrates typical instability that can arise from
multiscale operator splitting applied to the Brusselator problem. Solution is
shown at time 80. On the right, we show plots of the error in the $L^2$ norm
versus time step size at different times.

The goal is to derive an accurate a posteriori estimate of the error in a
specified quantity of interest computed from a multiscale operator splitting
approximate solution of (1.7). The estimate must account for the stability
effects arising from operator splitting. In the analysis, we distinguish the
effects of operator splitting from the effects of numerical discretization of
the components. The operator splitting discretization Alg. 3 is a consistent
discretization of the formal “analytic” operator splitting Alg. 1 and the
numerical error arising in each component can be treated with the standard a
posteriori analysis discussed previously. Estimating the error arising from the
operator splitting itself requires a new approach.

A main technical issue is the definition of a suitable adjoint problem because
the standard approach used for nonlinear problems described in Sec. 1.4.3 fails.
Indeed, the adjoint operator corresponding to the solution operator for an
operator decomposition discretization is typically different from the adjoint
operator associated with the true solution operator. Because an adjoint problem
carries the global stability information about the quantity of interest,
accounting for the differences between adjoint problems associated with the
original problem and a numerical discretization is critical for obtaining
accurate error estimates.

In the estimate described below, this difference takes the form of “residuals”
between certain adjoint operators associated with the fully coupled problem and
an analytic operator split version. A practical difficulty with such a result is
that solving the adjoint for the fully coupled problem poses the same
multiphysics challenges as solving the original forward problem. We therefore
develop a new hybrid a priori - a posteriori estimate that combines a computable
leading order expression obtained using a posteriori arguments with a provably
higher order bound obtained using an a priori convergence result.

1.6.2.1  A linear algebra example  We illustrate the ideas of the analysis in the
context of solving a linear system

$$
My = b, \tag{1.80}
$$

where $M$ is an $n\times n$ matrix of the form

$$
M = I - \epsilon(A+B),
$$

with $I$ denoting the identity matrix, $A$ and $B$ denoting $n\times n$ matrices, and
$\epsilon$ denoting a small parameter. By standard results, $M$ is invertible for all
sufficiently small $\epsilon$.

The analog of an operator splitting for (1.80) is the problem

$$
N\tilde y = b, \tag{1.81}
$$

where

$$
N = (I-\epsilon A)(I-\epsilon B)
$$

is also invertible for $\epsilon$ small. This follows from the observation

$$
M^{-1} = (I-\epsilon(A+B))^{-1}, \qquad N^{-1} = (I-\epsilon B)^{-1}(I-\epsilon A)^{-1},
$$

so that inverting $N$ involves inverting operators involving only $A$ and $B$
individually. Using the Neumann series to represent the inverse operators leads
to the estimate,

$$
\begin{aligned}
M^{-1} - N^{-1} ={}& I + \epsilon(A+B) + \epsilon^2(A+B)^2 + \epsilon^3(A+B)^3 + \cdots \\
&- (I + \epsilon B + \epsilon^2B^2 + \epsilon^3B^3 + \cdots)\times(I + \epsilon A + \epsilon^2A^2 + \epsilon^3A^3 + \cdots) \\
={}& \epsilon^2 AB + O(\epsilon^3).
\end{aligned} \tag{1.82}
$$
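
The size of the gap between the two inverses can be checked directly. Collecting the $\epsilon^2$ terms by hand, the coupled series contributes $(A+B)^2 = A^2 + AB + BA + B^2$ while the product of the two component series contributes $B^2 + BA + A^2$, so the remainder at second order is $AB$. The sketch below tests this numerically; the matrix size, the random seed, and the helper name `inverse_gap` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6  # small hypothetical size

# Random U(-1, 1) matrices, normalized to unit 2-norm.
A = rng.uniform(-1.0, 1.0, (n, n))
B = rng.uniform(-1.0, 1.0, (n, n))
A /= np.linalg.norm(A, 2)
B /= np.linalg.norm(B, 2)
I = np.eye(n)


def inverse_gap(eps):
    """Return M^{-1} - N^{-1} for M = I - eps(A+B), N = (I - eps A)(I - eps B)."""
    M = I - eps * (A + B)
    N = (I - eps * A) @ (I - eps * B)
    return np.linalg.inv(M) - np.linalg.inv(N)


for eps in (1e-1, 1e-2, 1e-3):
    gap = inverse_gap(eps)
    remainder = np.linalg.norm(gap - eps**2 * A @ B, 2)
    print(f"eps = {eps:.0e}   ||gap|| / eps^2 = {np.linalg.norm(gap, 2) / eps**2:.4f}"
          f"   ||gap - eps^2 AB|| / eps^3 = {remainder / eps**3:.4f}")
```

Both printed ratios should stay bounded as $\epsilon$ decreases, consistent with a second-order gap whose leading term is $\epsilon^2 AB$.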

We consider the problem of solving (1.80) to compute the quantity of interest
$(y,\psi)$. We have the associated adjoint problems

$$
M^\top\phi = \psi, \qquad N^\top\tilde\phi = \psi.
$$

If $\tilde Y \approx \tilde y$ is a numerical solution, the standard a posteriori analysis in
Ex. 1.41 gives

$$
(\tilde e,\psi) = (\tilde Y - \tilde y,\psi) = (\tilde\phi,\tilde R), \qquad \tilde R = N\tilde Y - b.
$$

Since we assume that problems involving $N$ are solvable, this is a computable
estimate. However, the error we wish to estimate is $(\tilde Y - y,\psi)$. We write
this as

$$
(\tilde Y - y,\psi) = (\tilde Y - \tilde y,\psi) + (\tilde y - y,\psi) = (\tilde\phi,\tilde R) + (\tilde y - y,\psi).
$$

The second term on the right is the error arising from the operator splitting.
Since these problems are linear, we can use the Green's function formulas

$$
(y,\psi) = (\phi,b), \qquad (\tilde y,\psi) = (\tilde\phi,b)
$$

to conclude

$$
(\tilde Y - y,\psi) = (\tilde\phi,\tilde R) + (\tilde\phi - \phi,b). \tag{1.83}
$$

Example 1.62  We let $A$ and $B$ be random 500 × 500 matrices, where the
coefficients in the random matrices are $U(-1,1)$, normalized in the 2-norm, and
$\epsilon = .01$. We set $y$ to be a random vector with $U(-1,1)$ coefficients, and set
$b = My$, which ensures that $y$ is a solution to within machine precision.
Finally, we set $\psi = (1,0,\dots,0)^\top$. We compute $\tilde Y$ using Gaussian
elimination. We find

(y-y,'0)  fs  5.376  X  10"^ 

{$,R)  ^  1.221  X  10“'=^ 

{i-(p,  6)  R5  5.376  X  10"®. 

This  means  that  nearly  all  of  the  error  is  captured  by  the  effect  of  operator 
splitting  on  the  adjoint  solution. 
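
A small numerical experiment in the spirit of this example can be sketched as follows. It is not the computation reported above: the matrix size is reduced, the random seed and all variable names are arbitrary choices, and `np.linalg.solve` stands in for Gaussian elimination. The sketch splits the error in the quantity of interest into a residual term for the split problem and a term measuring the difference between the adjoint of the split problem and the adjoint of the coupled problem.

```python
import numpy as np

rng = np.random.default_rng(2)
n, eps = 200, 0.01  # smaller than 500 x 500, to keep the sketch fast


def unit_random(size):
    """Random U(-1, 1) matrix scaled to unit 2-norm."""
    X = rng.uniform(-1.0, 1.0, (size, size))
    return X / np.linalg.norm(X, 2)


A, B = unit_random(n), unit_random(n)
I = np.eye(n)
M = I - eps * (A + B)              # fully coupled operator
N = (I - eps * A) @ (I - eps * B)  # operator-split analog

y = rng.uniform(-1.0, 1.0, n)      # chosen "true" solution
b = M @ y
psi = np.zeros(n)
psi[0] = 1.0                       # quantity of interest: first coefficient

Y = np.linalg.solve(N, b)          # numerical solution of the split problem
R = N @ Y - b                      # its residual (rounding level)

phi = np.linalg.solve(M.T, psi)        # adjoint of the coupled problem
phi_split = np.linalg.solve(N.T, psi)  # adjoint of the split problem

total = (Y - y) @ psi
discretization = phi_split @ R
splitting = (phi_split - phi) @ b

print(f"error in quantity of interest : {total: .3e}")
print(f"residual term                 : {discretization: .3e}")
print(f"adjoint difference term       : {splitting: .3e}")
```

Because the split problem is solved to rounding accuracy here, the residual term is negligible and essentially all of the error is carried by the adjoint difference term.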

As  noted  above,  (1.83)  is  problematic  because  it  requires  the  solution  of  the 
“true"  adjoint  problem,  which  is  unavailable  in  the  operator  splitting  paradigm. 

1.6.2.2  Description of the hybrid a posteriori-a priori error analysis  We now
describe an error estimate for a multiscale operator decomposition solution of
(1.7) that is composed of a leading expression that is a posteriori and an a
priori expression that is provably higher order. See (Estep et al., 2008a) for
the full analysis.

We begin with the decomposition

$$
\tilde Y - y = (\tilde Y - \tilde y) + (\tilde y - y), \tag{1.84}
$$

where $y$ solves (1.7), $\tilde y$ is computed via the abstract operator splitting
Alg. 1, and $\tilde Y$ is computed via the numerical operator splitting Alg. 3.

The first expression on the right of (1.84) is the error of $\tilde Y$ as a solution
of the operator split problem. Note that $\tilde Y$ is a consistent numerical solution
for the analytic operator split problem and the expression for its error can be
estimated using the standard a posteriori error analysis. We let $\vartheta^d$ define
the adjoint solution associated with the diffusion component (1.79) satisfying

$$
\begin{cases}
-\dot\vartheta^d = A^\top\vartheta^d, & t_n > t \ge t_{n-1},\\
\vartheta^d(t_n) = \psi_n.
\end{cases}
$$

Furthermore, we let $\vartheta^r$ define the adjoint solution associated with the
reaction component (1.78) satisfying

$$
\begin{cases}
-\dot\vartheta^r = \bigl(\overline{F'}(\tilde y^r,\tilde Y^r)\bigr)^\top\vartheta^r, & s_{m,n} > s \ge s_{m-1,n},\\
\vartheta^r(s_{m,n}) = \psi_{m,n},
\end{cases}
$$

for $m = M_n,\dots,1$, with $\psi_{M_n,n} = \vartheta^d(t_{n-1})$ and
$\psi_{m,n} = \vartheta^r(s_{m,n}^+)$ for $m < M_n$. Thus $\vartheta^r$ is continuous across the
internal reaction time nodes $s_{m,n}$, $m = 1,\dots,M_n-1$. Here,

$$
\overline{F'}(\tilde y^r,\tilde Y^r) = \int_0^1 F'\bigl(s\tilde y^r + (1-s)\tilde Y^r\bigr)\, ds.
$$

The second expression on the right of (1.84) is an abstract error of operator
splitting. Following the analysis for the linear algebra example, we use analogs
of the classic representation formula involving the Green's function of a linear
elliptic problem to construct an estimate. The nonlinearity complicates the
analysis, however, because we have to use linearization to define unique adjoint
problems, which raises the issue of choosing a trajectory around which to
linearize. We cannot use the standard approach of linearizing the error
representation described in Sec. 1.4.3 because of the operator splitting.
Instead, we assume that both the original problem and the operator split version
have a common solution and we linearize each problem in a neighborhood of this
common solution. For example, we assume that $y = 0$ is a steady state solution
of both problems, which can be achieved by assuming that

Homogeneity Assumption: $F(0) = 0$,

and we linearize in a region around 0. In terms of applications to
reaction-diffusion problems, there are mathematical reasons for making the
homogeneity assumption and it is satisfied in a great many applications.
However, we can modify the analysis to allow for linearization around any known
common solution, see (Estep et al., 2008a).

To motivate this definition, we derive an abstract Green's function
representation. On the time interval $(t_{n-1},t_n)$, we consider the linearized
problem,

$$
\dot y = Ay(t) + \overline{F'}(y)\,y(t),
$$

where

$$
\overline{F'}(y) = \int_0^1 F'(sy)\, ds.
$$

We note that $\overline{F'}(y)\,y = F(y)$ because $F(0) = 0$. The generalized Green's
function $\phi$ satisfies the adjoint problem

$$
\begin{cases}
-\dot\phi = A^\top\phi(t) + \overline{F'}(y)^\top\phi(t), & t_n > t \ge t_{n-1},\\
\phi(t_n) = \psi_n,
\end{cases} \tag{1.85}
$$

where $\psi_n$ determines the quantity of interest $(y(t_n),\psi_n)$, and $A^\top$ and
$\overline{F'}(y)^\top$ denote the transpose of $A$ and $\overline{F'}(y)$, respectively. We choose
$\psi_n = \phi(t_n^+)$, which couples the local adjoint problems (1.85) to form a
global adjoint problem. This definition yields a simple representation of the
solution value over one time step

$$
(y_n,\psi_n) = (y_{n-1},\phi(t_{n-1})), \quad n = 1,2,\dots,N \quad\Longrightarrow\quad (y_N,\psi_N) = (y_0,\phi(0)). \tag{1.86}
$$

We use analogs of (1.86) for solutions of each component in the operator
splitting discretization. For $n = 1,\dots,N$, we define the three adjoint problems.
The diffusion problem is simpler because it is linear,

It  is  convenient  to  let  denote  the  solution  operator,  so  We 

require two adjoint problems to treat the reaction component. The difference
between the problems is the function around which they are linearized,

>  t  ^  tn-l-.  M  gg^ 

If  $(,(2)  denotes  the  solution  operator  for  the  problem  linearized  around  a  func¬ 
tion  z.  then  we  have  <^i(t„_i)  =  and  We  can 

now prove (Estep et al., 2008a).

Theorem  1.63  A  hybrid  a  posteriori  -  a  priori  error  estimate  for  the  multiscale 
operator  splitting  dG  finite  element  method  is 


-  VNti’N) 


N  Mn  /  I- 
n=l  ms  I  '  ^m.n 


-h((n-n-l,«,T9':.-,,„-ni9^+_,,„) 


-u¥)dt 


+  ^  (y;._i,  (E,  -I-  £'2)^„)  +  0(At’-+2)  +  0(At  As’-'+i), 


E,  =  i At,,  (/l^^(y)  -  J^{Y)A'^)  ,  HY)  =  I  F'{Y)  dt, 

E2  =  (^;(y)  -  ^>;;(y"))  <■ 


where

The first expression on the right is the error introduced by the numerical
solution of the reaction component. Likewise, the second expression on the right
is the error introduced by the numerical solution of the diffusion component.
The third expression on the right measures the effects of operator splitting.
The expression $E_1$ is a leading order estimate for the effects of operator splitting
while $E_2$ accounts for issues arising from the differences in linearizing around the
global computed solution as opposed to the solution of the reaction component,
which affects the formulation of the adjoint problems. Both of these quantities
are scaled by the solution itself, so that these effects become negligible when
the solution approaches zero. Finally, the remaining terms represent bounds on
terms that are not computable but are higher order. In practice, we neglect those
terms when computing an estimate.

Using  the  estimate  requires  the  solution  of  five  adjoint  problems.  But  we  avoid 
the  need  to  solve  an  adjoint  problem  corresponding  to  linearization  around  the 
true  solution  by  deriving  the  hybrid  estimate. 

1.6.2.3 Numerical examples  We describe some examples in (Estep et al., 2008a).
Example 1.64 The first example is a partial differential equation version of Ex. 1.5,

$$\begin{cases}
\dfrac{\partial u}{\partial t} - 0.05\, \dfrac{\partial^2 u}{\partial x^2} = u^2, & x \in (0,1),\ t > 0, \\
u(0,t) = u(1,t) = 0, & t > 0, \\
u(x,0) = 4x(1 - x), & x \in (0,1).
\end{cases}$$

The solution of the reaction component exhibits finite time blow up when
undamped by the diffusion component. This is perhaps the most extreme form of
instability. For this computation, we use 20 spatial finite elements. Table 1.4
shows the ratio of the error to the estimate computed at the final time T = 1. In
this computation, we keep the reaction time step constant and vary the diffusion
time step and number of reaction time steps. We see that the estimate is very
accurate for a range of time steps.

Table 1.4. Operator splitting error estimate for the blow up problem at T = 1,
reaction time step = $10^{-3}$

  $\Delta t$    M      Exact Err (%)    Error/Estimate
  $10^{-1}$     100    11.07            1.0286
  $10^{-2}$     10     1.35             1.0067
  $10^{-3}$     1      0.45             1.0020
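The structure of the computation in Example 1.64, one diffusion step of size
$\Delta t$ combined with $M$ reaction substeps of size $\Delta t / M$, can be sketched as
follows. This is a minimal illustration under stated assumptions and not the
dG finite element method of the text: it uses second order finite differences
in space, forward Euler for the reaction substeps, backward Euler for the
diffusion step, and takes the diffusion coefficient `eps = 0.05` as an assumed
value. The function names are hypothetical.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system by forward elimination and back substitution.

    lower[0] and upper[-1] are not used.
    """
    n = len(rhs)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        piv = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / piv
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / piv
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def split_solve(eps=0.05, cells=20, dt=1e-2, M=10, T=1.0):
    """Operator splitting for u_t - eps u_xx = u^2 with u = 0 at x = 0, 1."""
    h = 1.0 / cells
    x = [i * h for i in range(1, cells)]          # interior nodes
    u = [4.0 * xi * (1.0 - xi) for xi in x]       # initial condition
    n = len(u)
    r = eps * dt / h ** 2
    lower, diag, upper = [-r] * n, [1.0 + 2.0 * r] * n, [-r] * n
    ds = dt / M                                   # reaction time step
    for _ in range(round(T / dt)):
        # Reaction component u' = u^2: M forward Euler substeps.
        for _ in range(M):
            u = [ui + ds * ui * ui for ui in u]
        # Diffusion component u_t = eps u_xx: one backward Euler step.
        u = thomas(lower, diag, upper, u)
    return x, u


x, u = split_solve()
print(max(u))
```

Holding `dt / M` fixed while varying `dt` reproduces the setup of Table 1.4, in
which the reaction time step is constant and only the splitting step changes.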

Example 1.65 We next consider the Brusselator problem (1.4) with $\alpha = 2$,
$\beta = 5.45$, $k_1 = 0.008$, $k_2 = 0.004$ and initial conditions $u_1(x,0) = \alpha + 0.1\sin(\pi x)$
and $u_2(x,0) = \beta/\alpha + 0.1\sin(\pi x)$, which yields an oscillatory solution. In this
case, the reaction is very mildly unstable, with at most polynomial rate
accumulation of perturbations as time passes. We use a 32 node spatial finite element
discretization, resulting in a differential equation system with dimension 62.
We note that in original form, the reaction terms do not satisfy the requirement
$F(0) = 0$, so we linearize around the steady state solution $c$ with $c_i = \alpha$ for
$i = 1, \dots, N_e - 1$ and $c_i = \beta/\alpha$ for $i = N_e, \dots, 2N_e - 2$, so that $F(c) = 0$.
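The shift to the steady state can be checked pointwise. The sketch below
assumes the standard Brusselator kinetics, $f_1 = \alpha - (\beta + 1) u_1 + u_1^2 u_2$ and
$f_2 = \beta u_1 - u_1^2 u_2$, since equation (1.4) is not reproduced in this section;
the function names are hypothetical. It shows that the reaction terms do not
vanish at the origin, while the terms rewritten for the perturbation
$w = u - c$ do.

```python
def brusselator_reaction(u1, u2, alpha=2.0, beta=5.45):
    """Standard Brusselator kinetics (assumed form of the reaction terms)."""
    f1 = alpha - (beta + 1.0) * u1 + u1 * u1 * u2
    f2 = beta * u1 - u1 * u1 * u2
    return f1, f2


def shifted_reaction(w1, w2, alpha=2.0, beta=5.45):
    """Reaction terms for the perturbation w = u - c, with c = (alpha, beta/alpha)."""
    return brusselator_reaction(alpha + w1, beta / alpha + w2, alpha, beta)


# In original form the homogeneity assumption fails: F(0) = (alpha, 0).
residual_at_origin = brusselator_reaction(0.0, 0.0)

# After shifting by the steady state, G(0) = F(c) = 0.
residual_at_steady_state = shifted_reaction(0.0, 0.0)

print(residual_at_origin, residual_at_steady_state)
```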

Fig. 1.30 compares the errors computed using $\Delta t = 0.01$ and M = 10 reaction
time steps to the hybrid a posteriori error estimates. We show results for [0, 2],
when the solution is still in a transient stage, and at T = 40 when the solution
has become periodic. All the results show that the exact and estimated errors
are in remarkable agreement.


Fig. 1.30. Brusselator results. Left: Comparison of errors against the spatial
location at T = 2. Middle: Time history of errors at the midpoint location
on [0, 2]. Right: Comparison of errors against the spatial location at T = 40.
The dotted line is the exact error and the (+) is the estimated error.


1.6.3  Multiscale  decomposition  of  a  fluid-solid  conjugate  heat  transfer  problem 

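Following (Estep et al., 2008b), we next consider the multiscale decomposition
solution of the heat transfer problem described in Example 1.3. The weak
formulation of (1.5) consists of computing $u \in V_F$, $p \in L^2_0(\Omega_F)$, $T_F \in W_F$ and
$T_S \in W_S$ such that

$$\begin{cases}
a_1(u, v) + c_1(u, u, v) + b(v, p) + d(T_F, v) = (f, v), \\
b(u, q) = 0, \\
a_2(T_F, w_F) + c_2(u, T_F, w_F) + a_3(T_S, w_S) = (Q_F, w_F) + (Q_S, w_S),
\end{cases} \qquad (1.90)$$

for all $v \in V_{F,0}$, $q \in L^2_0(\Omega_F)$, $w_F \in W_{F,0}$ and $w_S \in W_{S,0}$, where

$$\begin{aligned}
a_1(u, v) &= \int_{\Omega_F} \mu\, (\nabla u : \nabla v)\, dx, &
a_2(T_F, w_F) &= \int_{\Omega_F} k_F\, (\nabla T_F \cdot \nabla w_F)\, dx, \\
a_3(T_S, w_S) &= \int_{\Omega_S} k_S\, (\nabla T_S \cdot \nabla w_S)\, dx, &
b(v, q) &= -\int_{\Omega_F} (\nabla \cdot v)\, q\, dx, \\
c_1(u, v, z) &= \int_{\Omega_F} \rho_0\, (u \cdot \nabla) v \cdot z\, dx, &
c_2(u, T, w) &= \int_{\Omega_F} \rho_0 c_p\, (u \cdot \nabla T)\, w\, dx, \\
d(T, v) &= \int_{\Omega_F} \rho_0 \beta\, T\, g \cdot v\, dx, &
f &= \rho_0 (1 + \beta T_0)\, g,
\end{aligned}$$

and

$$\begin{aligned}
V_F &= \{ v \in H^1(\Omega_F) \mid v = g_{u,D} \text{ on } \Gamma_{u,D} \}, &
V_{F,0} &= \{ v \in H^1(\Omega_F) \mid v = 0 \text{ on } \Gamma_{u,D} \}, \\
W_F &= \{ w \in H^1(\Omega_F) \mid w = g_{T_F,D} \text{ on } \Gamma_{T_F,D} \}, &
W_{F,0} &= \{ w \in H^1(\Omega_F) \mid w = 0 \text{ on } \Gamma_{T_F,D} \}, \\
W_S &= \{ w \in H^1(\Omega_S) \mid w = g_{T_S,D} \text{ on } \Gamma_{T_S,D} \}, &
W_{S,0} &= \{ w \in H^1(\Omega_S) \mid w = 0 \text{ on } \Gamma_{T_S,D} \}.
\end{aligned}$$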
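To discretize, we construct independent locally-quasi-uniform triangulations
$\mathcal{T}_{F,h}$ and $\mathcal{T}_{S,h}$ of $\Omega_F$ and $\Omega_S$ respectively. We use the piecewise polynomial spaces

$$\begin{aligned}
V^h &= \{ v \in V_F \mid v \text{ continuous on } \Omega_F,\ v_i \in P^2(K) \text{ for all } K \in \mathcal{T}_{F,h} \}, \\
Z^h &= \{ z \in Z \mid z \text{ continuous on } \Omega_F,\ z \in P^1(K) \text{ for all } K \in \mathcal{T}_{F,h} \}, \\
W^h_F &= \{ w \in W_F \mid w \text{ continuous on } \Omega_F,\ w \in P^2(K) \text{ for all } K \in \mathcal{T}_{F,h} \}, \\
W^h_S &= \{ w \in W_S \mid w \text{ continuous on } \Omega_S,\ w \in P^2(K) \text{ for all } K \in \mathcal{T}_{S,h} \},
\end{aligned}$$

and the associated subspaces

$$\begin{aligned}
V^h_0 &= \{ v \in V^h \mid v = 0 \text{ on } \Gamma_{u,D} \}, \\
W^h_{F,0} &= \{ w \in W^h_F \mid w = 0 \text{ on } \Gamma_{T_F,D} \}, \\
W^h_{S,0} &= \{ w \in W^h_S \mid w = 0 \text{ on } \Gamma_{T_S,D} \text{ and } w = 0 \text{ on } \Gamma_I \},
\end{aligned}$$

where $P^q(K)$ denotes the space of polynomials of degree $q$ on an element $K$.
Note that $W^h_{S,0}$ is different than $W^h_{F,0}$ in an important way since $\Gamma_{T_F,D}$ does not
include $\Gamma_I$. We let $\pi_V$, $\pi_{W_F}$, $\pi_{W_S}$ and $\pi_Z$ be projections into $V^h$, $W^h_F$, $W^h_S$ and
$Z^h$ respectively. We also use $\pi_F$ and $\pi_S$ to denote projections into $W^h_F$ and
$W^h_S$ respectively along the interface $\Gamma_I$.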


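Algorithm 4 Multiscale Decomposition Method for Conjugate Heat Transfer

k = 0

while ($\| T_{S,h}^{(k)} - \pi_S T_{F,h}^{(k)} \|_{\Gamma_I} > \mathrm{TOL}$) do

  k = k + 1

  Compute $T_{S,h}^{(k)} \in W^h_S$ such that $T_{S,h}^{(k)} = \pi_S T_{F,h}^{(k-1)}$ along the interface $\Gamma_I$
  and

  $$a_3(T_{S,h}^{(k)}, w) = (Q_S, w), \quad \forall w \in W^h_{S,0}, \qquad (1.91)$$

  Compute $u_h^{(k)} \in V^h$, $p_h^{(k)} \in Z^h$ and $T_{F,h}^{(k)} \in W^h_F$ such that

  $$\begin{cases}
  a_1(u_h^{(k)}, v) + c_1(u_h^{(k)}, u_h^{(k)}, v) + b(v, p_h^{(k)}) + d(T_{F,h}^{(k)}, v) = (f, v), & \forall v \in V^h_0, \\
  b(u_h^{(k)}, q) = 0, & \forall q \in Z^h, \\
  a_2(T_{F,h}^{(k)}, w) + c_2(u_h^{(k)}, T_{F,h}^{(k)}, w) = (Q_F, w) - (k_S (n \cdot \nabla T_{S,h}^{(k)}), w)_{\Gamma_I}, & \forall w \in W^h_{F,0}.
  \end{cases} \qquad (1.92)$$

end while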


To compute a stable solution of the fluid equations, we choose $V^h$ and $Z^h$ to
be the Taylor-Hood finite element pair satisfying the discrete inf-sup condition

$$\inf_{q \in Z^h} \sup_{v \in V^h_0} \frac{b(v, q)}{\|v\|_1\, \|q\|_0} \ge \beta > 0. \qquad (1.93)$$


We also note that the convergence of the iteration defined by this Algorithm
depends on the values of $k_S$ and $k_F$ along the interface and the geometry of each
region. Often, a "relaxation" is used to help improve convergence properties. We
choose $\alpha \in [0, 1)$ and impose the relaxed Dirichlet interface values

$$T_{S,h}^{(k)} = \alpha\, T_{S,h}^{(k-1)} + (1 - \alpha)\, \pi_S T_{F,h}^{(k-1)} \quad \text{on } \Gamma_I.$$

This affects the analysis, but we do not discuss that here, see (Estep et al., 2008a;
Estep et al., 2008b).
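The dependence of the iteration on the conductivities, the geometry and the
relaxation parameter can be seen in a one-dimensional caricature. The sketch
below is an assumption-laden stand-in for Alg. 4, not the method of the text:
both regions carry pure steady conduction without sources, so each component
solve is available in closed form, the "fluid" has no flow, and the relaxed
update takes the form $g \leftarrow \alpha g + (1 - \alpha) T_F|_{\Gamma_I}$. The function name and
parameters are hypothetical.

```python
def coupled_conduction(k_s, k_f, len_s, len_f, t_hot, t_cold,
                       alpha=0.0, tol=1e-10, max_iter=200):
    """Dirichlet (solid) / Neumann (fluid) iteration for 1D steady conduction.

    The solid occupies (0, len_s) with temperature t_hot at x = 0; the fluid
    occupies (len_s, len_s + len_f) with temperature t_cold at the far end.
    Returns (interface temperature, iterations used, converged flag).
    """
    g = t_cold                       # initial guess for the interface value
    for k in range(1, max_iter + 1):
        # Solid solve with Dirichlet value g at the interface: T_S is linear,
        # so the flux k_S dT_S/dx passed to the fluid is constant.
        flux = k_s * (g - t_hot) / len_s
        # Fluid solve with that flux as Neumann data at the interface.
        t_f_interface = t_cold - flux * len_f / k_f
        # Stopping test: mismatch of the two interface temperatures.
        if abs(g - t_f_interface) <= tol:
            return g, k, True
        # Relaxed Dirichlet value for the next solid solve.
        g = alpha * g + (1.0 - alpha) * t_f_interface
    return g, max_iter, False


# With k_S = 1, k_F = 0.9 and equal lengths the unrelaxed iteration has
# contraction factor -k_S len_f / (k_F len_s), of magnitude above one.
plain = coupled_conduction(1.0, 0.9, 1.0, 1.0, 1.0, 0.0, alpha=0.0)
relaxed = coupled_conduction(1.0, 0.9, 1.0, 1.0, 1.0, 0.0, alpha=0.5)
print(plain, relaxed)
```

In this model the interface map has slope $-k_S L_F / (k_F L_S)$, so the unrelaxed
iteration converges only when that ratio is below one, while a suitable
relaxation restores convergence.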


1.6.3.1 Description of an a posteriori error analysis  We define the adjoint
using the standard linearization approach. We define the errors

$$e_u = u - u_h, \quad e_p = p - p_h, \quad e_{T_F} = T_F - T_{F,h}, \quad \text{and} \quad e_{T_S} = T_S - T_{S,h}.$$

The adjoint problem for the quantity of interest

$$(\psi, e) = (\psi_u, e_u) + (\psi_p, e_p) + (\psi_{T_F}, e_{T_F}) + (\psi_{T_S}, e_{T_S})$$

for the coupled problem (1.5) is


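$$\begin{cases}
-\mu \Delta \phi + c_1^*(\phi) + \nabla z + c_{2u}^*(\theta_F) = \psi_u, & x \in \Omega_F, \\
-\nabla \cdot \phi = \psi_p, & x \in \Omega_F, \\
-k_F \Delta \theta_F + c_{2T}^*(\theta_F) + \rho_0 \beta\, (g \cdot \phi) = \psi_{T_F}, & x \in \Omega_F, \\
\theta_F = \theta_S, & x \in \Gamma_I, \\
k_F (n \cdot \nabla \theta_F) = k_S (n \cdot \nabla \theta_S), & x \in \Gamma_I, \\
-k_S \Delta \theta_S = \psi_{T_S}, & x \in \Omega_S,
\end{cases}$$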

with  adjoint  boundary  conditions 


•0* 

II 

p 

X  6  Tu  /j, 

50 

X  €  Tu^;V  1 

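$$\begin{cases}
\theta_F = 0, & x \in \Gamma_{T_F,D}, \\
k_F (n \cdot \nabla \theta_F) = 0, & x \in \Gamma_{T_F,N}, \\
\theta_S = 0, & x \in \Gamma_{T_S,D}, \\
k_S (n \cdot \nabla \theta_S) = 0, & x \in \Gamma_{T_S,N}.
\end{cases}$$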

Here, we have used the linearizations


(1.94) 


(1.95) 


$$c_1^*(\phi) = \tfrac{1}{2}\rho_0\, \nabla(u + u_h) \cdot \phi - \tfrac{1}{2}\rho_0\, (u + u_h) \cdot \nabla\phi - \tfrac{1}{2}\rho_0\, (\nabla \cdot (u + u_h))\, \phi,$$

$$c_{2u}^*(\theta) = \tfrac{1}{2}\rho_0 c_p\, \nabla(T + T_h)\, \theta,$$

We solve (1.94) numerically using an iterative operator decomposition
approach as for the forward problem. These iterations are completely independent
of the forward iterations. In (Estep et al., 2008a; Estep et al., 2008b), we derive
estimates that only require adjoint solutions of the two component problems.

To write out the a posteriori error representation, we introduce an additional
projection  ;  fP  — *  W^q  defined  such  that  for  any  node  Xi 


TtWs^siXi)  = 


^\Vs^s{Xi), 

0. 


i  F/, 
I,  €  F/, 


along  with  rra^s  =  rrvvs^S~^vVs^S'  these  projections  is  made  clear  in 

the context of improving accuracy, see (Estep et al., 2008b) and remarks below.

We can now prove (Estep et al., 2008b),

Theorem  1.66  The  errors  satisfy 

(tp,  e)  =  (/,  0  -  7rv(p)  -  nvM  -  cp  -  nyip) 

-  b{<p  -  TTv<l>,Ph)  -  t:v4>)  -  b{u[^\z  -  mz)  (1.96) 

+  {Qfi  —  FWf-Gf)  —  ,  0 F  —  '’tn'y0F) 

-  C2{u[^K,T^^^,9f  -  ^Wf0f)  +  (<5s>^s  -  FWs0s) 

-a-i{T^'i^,es-Fws0s)  (1.97) 

+  [Ts'h  -  ^sT^Flf^sin  ■  Ves))r,  +  -  4'a  >  ks{n  ■  V^s))r, 

(1.98) 

+  {ks{n-  VrJ|j,^),7rw'peF)j,^  +  (<3s; -  o.3{Tg''jl,nws0s)-  (199) 

The contributions to the error are

•  Equations (1.96)-(1.97) represent the contribution of the discretization
error arising from each component solve.

•  Equation (1.98) represents the contribution from the iteration.

•  The first term in (1.99) represents the contribution of the transfer error while
the remaining terms represent the contribution arising from projections
between two different discretizations.

Example 1.67 We consider an example from (Estep et al., 2008b). For the flow
past a cylinder shown in Figure 1.2, we solve the steady non-dimensionalized
Boussinesq equations in the fluid domain and the non-dimensional heat equation
in the solid domain. To simulate the flow of water past a cylinder made from
stainless steel, we set the dimensionless constants Pr = 6.6 and $k_R = 30$, and
choose the inflow velocity and the temperature gradient so that

Re = 75, Pe = 495, Fr = 0.001, Ra = 50.

The temperature gradient is imposed by setting different temperatures along the
top and bottom boundaries, with a linear temperature gradient on the inflow
boundary, and an adiabatic condition on the outflow boundary.

We show results for two quantities of interest. The first is the temperature in
a small region in the wake, located approximately one channel width downstream
of the center of the cylinder and 1/4 of a channel width below the upper wall.
The second is the temperature in a small region in the center of the cylinder. In each
case, we derive an a posteriori bound by the usual methods, and base adaptivity
on an element tolerance of 1 x 10“®.

We show the final adaptive meshes for the flow in Fig. 1.31 and for the solid
in Fig. 1.32. For the first quantity of interest, the flow mesh is most refined
near the region of interest and upstream of the region of interest, locating more
elements between the cylinder and the top wall than between the cylinder and the
bottom wall since the flow advecting heat to the region of interest passes above
rather than below the cylinder. The solution downstream of the region of interest can
be computed with less accuracy, as is recognized by the coarser mesh. For the
solid, the mesh is highly refined along the top in order to increase the accuracy
of the normal derivative that is computed in the solid and used as a boundary
condition in the fluid computation. Evidently, the normal derivatives elsewhere
on the interface have less of an influence on the first quantity of interest. For the




Fig. 1.31. Upper: Final adaptive mesh in the fluid when the quantity of
interest is the temperature in a small region in the wake above the cylinder.
Lower: Final adaptive mesh in the fluid when the quantity of interest is the
temperature in a small region in the center of the solid.


second quantity of interest, the mesh is highly refined upstream of the cylinder.
We note that the refinement downstream of the cylinder corresponds closely to
the recirculation region, and the mesh refinement is slightly asymmetric about
the midplane of the channel due to the asymmetric initial mesh. The mesh in the
solid is refined uniformly near the boundary, reflecting the fact that the error
in the finite element flux makes a significant contribution to the error in the
quantity of interest.


Fig. 1.32. Left: Final adaptive mesh in the solid when the quantity of interest is
the temperature in a small region in the wake above the cylinder. Right: Final
adaptive mesh in the solid when the quantity of interest is the temperature
in a small region in the center of the solid.


1.6.3.2 Loss of order and flux correction  The meshes shown in Fig. 1.31 and
Fig. 1.32 are highly refined near the interface. This reflects the fact that there is
significant error in the numerical flux passed between the components. It turns
out that this pollutes the entire computation, so that overall the method loses
an entire order of accuracy.

Example  1.68  We  apply  Alg.  4  to  the  steady  flow  of  a  Newtonian  fluid  in  a 
two-dimensional  channel  connected  along  one  boundary  to  a  solid  which  is  heated 
from  below  as  shown  in  Fig.  1.33. 


Fig. 1.33. Left: Computational domain for motivational example. Right:
Temperature fields within the fluid and the solid.


The Reynolds number (based on the channel width and the flux averaged
inlet velocity) is Re = 2.5 and the thermal conductivities are $k_F = 0.9$ and
$k_S = 1 + 0.5\sin(2\pi x)\sin(2\pi y)$, which are chosen so that the solution is smooth,
but nontrivial. The temperature fields are displayed in Fig. 1.33.

We solve the problem iteratively and, to approximate the error, we compute
a reference solution with a higher order method on the same mesh. In Fig. 1.34,
we compare the errors in the temperature fields over $\Omega_S \cup \Omega_F$ on a series of
meshes that align along the interface $\Gamma_I$.


Fig. 1.34. Comparison of the mesh size, h, versus the error in the
temperature field when the finite element flux is passed.


We see that the solution converges at a second order rate, rather than the
optimal third order rate. This loss of order is a consequence of the operator
decomposition, as the computed boundary flux obtained from the finite element
solution is one order less accurate than the solution itself and this error pollutes
the rest of the computation.

One way to compensate for the loss of order is by refining the mesh locally
near the interface. Another way is to compute the particular information, in this
case the flux on the interface, more accurately. It turns out that we can adapt a
post-processing technique called flux correction, developed originally by Wheeler
(Wheeler, 1974) and Carey (G.F. Carey, 1985; Carey, 1982), to recover boundary
flux values with increased accuracy.

We denote the set of elements in $\mathcal{T}_{S,h}$ that intersect the interface boundary
by

$$\mathcal{T}^I_{S,h} = \{ K \in \mathcal{T}_{S,h} \mid K \cap \Gamma_I \neq \emptyset \},$$

and consider the corresponding finite element space

$$\Sigma_h = \{ v \in P^2(K) \text{ with } K \in \mathcal{T}^I_{S,h},\ v(\eta_i) = 0 \text{ if } \eta_i \notin \Gamma_I \},$$

where $\{\eta_i\}$ denotes the nodes of element $K$. The degrees of freedom correspond
to the nodes on the boundary. We compute $\sigma^{(k)} \in \Sigma_h$ satisfying

$$-(\sigma^{(k)}, v)_{\Gamma_I} = (Q_S, v) - a_3(T_{S,h}^{(k)}, v), \quad \text{for all } v \in \Sigma_h,$$

where $T_{S,h}^{(k)}$ solves (1.91). Since the dimension of the problem scales with the
number of nodes on a boundary, it is relatively inexpensive to solve.
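The effect of recovering the boundary flux from the weak residual, rather than
differentiating the finite element solution, can be seen in a one-dimensional
model problem. The sketch below is an illustration under stated assumptions,
not the two-dimensional computation of the text: it solves $-u'' = f$ on $(0,1)$
with homogeneous Dirichlet data using piecewise linear elements, and the
function names are hypothetical. The recovered flux at $x = 1$ is obtained by
testing the residual with the boundary hat function $\phi_n$, that is
$u'(1) \approx a(u_h, \phi_n) - (f, \phi_n)$.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are not used."""
    n = len(rhs)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        piv = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / piv
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / piv
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def flux_at_right_end(f, n):
    """P1 finite elements for -u'' = f on (0,1) with u(0) = u(1) = 0.

    Returns (direct flux, recovered flux), both approximating u'(1).
    """
    h = 1.0 / n
    x = [i * h for i in range(n + 1)]
    m = n - 1                                    # interior unknowns
    lower, diag, upper = [-1.0 / h] * m, [2.0 / h] * m, [-1.0 / h] * m
    # Load vector (f, phi_i) by Simpson's rule on each element.
    rhs = [h / 3.0 * (f(x[i] - h / 2) + f(x[i]) + f(x[i] + h / 2))
           for i in range(1, n)]
    u = [0.0] + thomas(lower, diag, upper, rhs) + [0.0]
    # Direct flux: derivative of u_h in the last element.
    direct = (u[n] - u[n - 1]) / h
    # Recovered flux: residual tested with the boundary hat function phi_n.
    a_un = (u[n] - u[n - 1]) / h                 # a(u_h, phi_n)
    load_n = h / 6.0 * (2.0 * f(1.0 - h / 2) + f(1.0))   # (f, phi_n)
    recovered = a_un - load_n
    return direct, recovered


# f = 6x gives u = x - x^3 and the exact flux u'(1) = -2.
for n in (8, 16, 32):
    direct, recovered = flux_at_right_end(lambda s: 6.0 * s, n)
    print(n, abs(direct + 2.0), abs(recovered + 2.0))
```

In this one-dimensional setting with an exactly integrated load the nodal
values are exact, so the recovered flux is exact up to rounding while the
direct flux carries a first order error; in higher dimensions the recovery is
not exact, but the same construction yields a more accurate flux.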

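The modified Algorithm is given in Alg. 5.

Algorithm 5 Multiscale Decomposition Method for Conjugate Heat Transfer
with Flux Correction

k = 0

while ($\| T_{S,h}^{(k)} - \pi_S T_{F,h}^{(k)} \|_{\Gamma_I} > \mathrm{TOL}$) do

  k = k + 1

  Compute $T_{S,h}^{(k)} \in W^h_S$ such that $T_{S,h}^{(k)} = \pi_S T_{F,h}^{(k-1)}$ along the interface $\Gamma_I$
  and

  $$a_3(T_{S,h}^{(k)}, w) = (Q_S, w), \quad \forall w \in W^h_{S,0}, \qquad (1.100)$$

  Compute $\sigma^{(k)} \in \Sigma_h$ solving

  $$-(\sigma^{(k)}, v)_{\Gamma_I} = (Q_S, v) - a_3(T_{S,h}^{(k)}, v), \quad \forall v \in \Sigma_h, \qquad (1.101)$$

  Compute $u_h^{(k)} \in V^h$, $p_h^{(k)} \in Z^h$ and $T_{F,h}^{(k)} \in W^h_F$ such that

  $$\begin{cases}
  a_1(u_h^{(k)}, v) + c_1(u_h^{(k)}, u_h^{(k)}, v) + b(v, p_h^{(k)}) + d(T_{F,h}^{(k)}, v) = (f, v), & \forall v \in V^h_0, \\
  b(u_h^{(k)}, q) = 0, & \forall q \in Z^h, \\
  a_2(T_{F,h}^{(k)}, w) + c_2(u_h^{(k)}, T_{F,h}^{(k)}, w) = (Q_F, w) - (\sigma^{(k)}, w)_{\Gamma_I}, & \forall w \in W^h_{F,0}.
  \end{cases} \qquad (1.102)$$

end while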


It turns out that using the recovered boundary flux leads to a cancellation of
the "transfer error" term in the error representation formula, which is the source
of the loss of order. The new theorem reads (Estep et al., 2008b),

Theorem  1.69  The  errors  satisfy 

(i/',e)  =  (/,  0-  7rv0)  -al(u;\'‘■^l^-  7rv0i)  -  oi(u/,*''\u[*\ 0  -  7rv0) 

-  b{4>  -  Trv4>,Ph)  -d{T^l,4>-nv<l>)  -  6(u[*^^  -  7^z^)  (1.103) 

+  {Qpt  —  o.2(Tpl ,  9p  —  FWf^f) 

—  <^2{ull'\Tp^,dp  —  nwi^Op)  +  {Qs;0s  -  FWs&s) 

-<13{T^sa^^s-^Ws0s)  (1.104) 

+  (4*/  -  FsT^p';i^..ks{n-Ves})p,  +  {^sT^fa 

(1,105) 

+  •  (1.106) 

Note the difference in (1.106) compared to (1.99); now there is only a projection
error arising from a change of scale, without any transfer error expression.

We can prove that using the recovered flux recovers the expected cubic order
of convergence, see (Estep et al., 2008b).




Example  1.70  The  recovered  accuracy  is  easily  demonstrated  by  considering 
the  adapted  meshes  produced  by  (1.103)-(1.106).  We  repeat  the  computations 
in  Ex.  1.67  using  the  modified  error  bound  with  the  recovered  flux  derived  from 
(1.103)-(1.106)  to  guide  adaptive  mesh  refinement.  We  show  the  final  adaptive 
meshes  for  the  solid  in  Fig.  1.35.  There  is  no  mesh  refinement  near  the  boundaries, 
indicating  that  the  flux  error  is  no  longer  dominant. 


Fig. 1.35. Left: Final adaptive mesh in the solid when the quantity of interest is
the temperature in a small region in the wake above the cylinder. Right: Final
adaptive mesh in the solid when the quantity of interest is the temperature
in a small region in the center of the solid.


1.7  The  effect  of  iteration 

In the presentation above, we have minimized the effects arising from the
solution of nonlinear and/or fully coupled systems by carefully choosing the models
and results that are discussed. Referring back to Fig. 1.3, we generally expect
multiscale operator decomposition to require a number of iterations between the
physical components. This raises additional issues that need to be addressed,
e.g.

•  The convergence of the iteration is always paramount. Note that the
convergence is strongly affected by the fact that we are repeatedly upscaling
and downscaling information. Indeed, this may even affect the definition of
convergence, e.g. when coupling stochastic models to continuum models.

•  When iteration is required, then we are passing information, along with
error, between iteration levels as well as physical components, and this
requires defining additional auxiliary quantities of interest and corresponding
adjoint operators.

•  The "single physics paradigm" means that in practice we may have access
only to adjoint operators associated to the components, but not to the
entire system. This has strong consequences on which contributions to the
error may be estimated.




These issues are discussed in (Estep et al., 2008a; Estep et al., 2008b; Carey
et al., 2008a; Estep et al., 2008a; Estep et al., 2008b).

1.8  Conclusion 

Multiphysics, multiscale models present significant challenges in terms of
computing accurate solutions and for estimating the error in information computed
from numerical solutions. In this chapter, we discuss the problem of computing
accurate error estimates for one of the most common, and powerful, numerical
approaches for multiphysics, multiscale problems called multiscale operator
decomposition. This is a widely used technique for solving multiphysics,
multiscale models. The general approach is to decompose the multiphysics and/or
multiscale problem into components involving simpler physics over a relatively
limited range of scales, and then to seek the solution of the entire system through
some sort of iterative procedure involving numerical solutions of the individual
components. In general, different components are solved with different numerical
methods as well as with different scale discretizations. This approach is
appealing because there is generally a good understanding of how to solve a broad
spectrum of single physics problems accurately and efficiently, and because it
provides an alternative to accommodating multiple scales in one discretization.

In the first part of this chapter, we describe the ingredients of adjoint-based
a posteriori error analysis. We stress the need to accurately quantify stability of
particular information to be computed from a model and the role of the adjoint
problem for this purpose.

Turning to specific examples of multiscale, multiphysics models, we illustrate
the general observation that the stability properties of such models are
exceedingly complex. This heightens the importance of obtaining accurate information
about stability.

We then describe how the techniques of a posteriori error analysis can be
extended to multiscale operator decomposition solutions of multiphysics,
multiscale problems. While the particulars of the analysis vary considerably with
the problem, there are several key ideas underlying a general approach to treat
operator decomposition multiscale methods, including:

•  We identify auxiliary quantities of interest associated with information
passed between physical components and solve auxiliary adjoint problems
to estimate the error in those quantities.

•  We deal with scale differences by introducing projections between discrete
spaces used for component solutions and estimate the effects of those
projections.

•  The standard linearization argument used to define an adjoint operator
associated with error analysis for a nonlinear problem may fail, requiring
another approach to define adjoint operators.

•  In this regard, the adjoint operator associated with a multiscale operator
decomposition solution method is often different than the adjoint
associated with the original problem, and the difference may have a significant
impact on the stability of the method.

•  In practice, solving the adjoint associated with the original fully-coupled
problem may present the same kinds of multiphysics, multiscale challenges
posed by the original problem, so attention must be paid to the solution
of the adjoint problem.

We explain these ideas in the context of three specific examples.

In this paper, we analyze a multirate time integration method for systems of ordinary differential
equations that present significantly different scales within the components of the model. The main purpose of
this paper is to present a hybrid a priori - a posteriori error analysis that accounts for the effects of
projections between the coarse and fine scale discretizations, the use of only a finite number of iterations in
the iterative solution of the discrete equations, the numerical error arising in the solution of each
component, and the effects on stability arising from the multirate solution. The hybrid estimate has the form
of a computable a posteriori leading order expression and a provably-higher order a priori expression. We
support this estimate by an a priori convergence analysis. We present several examples illustrating the
accuracy of multirate integration schemes and the accuracy of the a posteriori estimate.

© 2012 Elsevier B.V. All rights reserved.


1.  Introduction 

In this paper, we analyze a multirate numerical method for a system
of ordinary differential equations that presents significantly
different scales for the rate of change of individual components of the
model. A multirate method employs discretizations on significantly
different time scales for different components of the problem. For
simplicity, we consider a model that can be decomposed into two
vector-valued components: find $y = (y_1, y_2)^\top \in \mathbb{R}^n$ that satisfies

$$\begin{cases}
\dot{y}_1 = F_1(y_1, y_2), & t \in (0, T], \\
\dot{y}_2 = F_2(y_1, y_2), & t \in (0, T], \\
y_1(0) = g_1, \quad y_2(0) = g_2,
\end{cases} \qquad (1)$$

for a given initial condition $g = (g_1, g_2)^\top$. Here, $y_i \in \mathbb{R}^{n_i}$, $i = 1, 2$,
$n = n_1 + n_2$, and $F = (F_1, F_2)^\top \in \mathbb{R}^n$, with $F_i(y) = F_i(y_1, y_2) \in \mathbb{R}^{n_i}$, $i = 1, 2$.
If $F_1$ and $F_2$ induce significantly different rates of change in the
respective solution components, then a heuristic consideration of
accuracy suggests that it is most efficient to solve (1) using small
time steps for the fast component and large time steps for the slow
component. We have assumed the form of (1), in which the
slow and fast components are distinguished, for the sake of exposition,
but we do not use knowledge of the slow and fast components
in the analysis. Indeed, the estimates we obtain can be used to
determine if a particular identification of fast and slow components
is correct. Also, the method and analysis extend to systems with
more than two scales in a straightforward way.

Such multiscale systems arise in a variety of applications, e.g.
fusion and fission, reacting flows, circuit analysis, convection problems,
and mechanical systems. As a useful example, we consider a
discrete model consisting of a wire in a state of constant tension $T$
supporting $N$ masses, see Fig. 1. The masses $m_i$, $i = 1, \dots, N$ have
horizontal positions $x_i$, $i = 1, \dots, N$ and vertical positions $y_i$, $i = 1, \dots, N$.

The horizontal spacing between masses is $a_r = x_r - x_{r-1}$,
$r = 1, \dots, N$. Applying Newton's second law and assuming that the
tensile forces are large compared with the gravitational forces, a linear
damping model, and that the masses are free to move in the vertical
direction only, we have

$$m_r \frac{d^2 y_r}{dt^2} = -T \sin(\theta_r) + T \sin(\theta_{r+1}) - 2\gamma_r \frac{dy_r}{dt},$$

where $\theta_r$ and $\theta_{r+1}$ denote the angles (measured positive
counterclockwise) between the horizontal and the wires connecting the
masses $m_{r-1}$ and $m_r$, and $m_r$ and $m_{r+1}$ respectively as illustrated.

Fig. 1. A model of masses connected by a wire under tension.

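Without loss of generality we assume that $T = 1$ and
$y_0 = y_{N+1} = 0$. To create a system with two time scales, we consider
two heavy masses coupled to $(N-2)$ lighter masses with
$m_1 = m_2 = M > m_3 = \cdots = m_N = m$, $a_1 = a_2 = A \gg a_3 = \cdots = a_N = a_{N+1} = a$,
see Fig. 2(a), and define $\Gamma = \gamma_1 M^{-1} = \gamma_2 M^{-1}$ and
$\gamma = \gamma_3 m^{-1} = \cdots = \gamma_N m^{-1}$. Making the small angle approximation
$\sin(\theta) \approx \theta \approx \tan(\theta)$ and introducing $v_r = dy_r/dt$ and $v_{N+r} = y_r$,
$r = 1, \dots, N$, we rewrite the system as a $2N$-dimensional system of
first-order differential equations

$$\frac{dv}{dt} = \begin{pmatrix} D & A \\ I_{N \times N} & 0_{N \times N} \end{pmatrix} v,$$

where

$$D = \begin{pmatrix} -2\Gamma\, I_{2 \times 2} & 0 \\ 0 & -2\gamma\, I_{(N-2) \times (N-2)} \end{pmatrix},$$

$$A = \begin{pmatrix}
-2(AM)^{-1} & (AM)^{-1} & 0 & \cdots & & 0 \\
(AM)^{-1} & -(AM)^{-1} - (aM)^{-1} & (aM)^{-1} & 0 & \cdots & 0 \\
0 & (am)^{-1} & -2(am)^{-1} & (am)^{-1} & & 0 \\
\vdots & & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & & (am)^{-1} & -2(am)^{-1} & (am)^{-1} \\
0 & & \cdots & 0 & (am)^{-1} & -2(am)^{-1}
\end{pmatrix},$$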

and $I_{n \times n}$ denotes the $n \times n$ identity matrix and $0_{n \times n}$ is the $n \times n$ zero matrix. We set $N = 7$, $A = .25$, $a = .1$, $M = 10$, and $m = .01$ and plot a typical solution in Fig. 2(b). The two scales for evolution are clear. We also show the pointwise accuracy for several multirate numerical solutions in Fig. 2(c). The accuracy gained by using smaller time steps for the fine scale is displayed, as is the fact that there is a limit to the accuracy that can be gained. Indeed, using a multirate approach affects both accuracy and stability. Consider the case of a two-dimensional system (1) with scalar fast $y_1$ and slow $y_2$ components. In this case, we can write

$$\frac{dy_1}{dy_2} = \frac{f_1}{f_2},$$

and solving (1) amounts to tracing out a smooth curve with slope $f_1/f_2$ at each point. Using a multirate method for approximating the change $(y_1, y_2) \to (y_1 + \Delta y_1, y_2 + \Delta y_2)$ amounts to replacing the implicit slope $\frac{f_1}{f_2}(y_1 + \Delta y_1, y_2 + \Delta y_2)$ at the new point by $\frac{f_1}{f_2}(y_1 + \Delta y_1, y_2)$, which has been "frozen" at the previous $y_2$ value. Drawing a few examples provides convincing evidence that this affects both accuracy and stability regardless of how well we approximate the step between $(y_1, y_2)$ and $(y_1 + \Delta y_1, y_2 + \Delta y_2)$.
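The accuracy limit caused by freezing the slow variable can be seen in a few lines of code. The following is a minimal sketch, not the mass-wire model above: it assumes the illustrative linear system $\dot y_1 = -a y_1 + y_2$, $\dot y_2 = -y_2$ with $y_1(0) = y_2(0) = 1$, takes `L` backward Euler substeps for the fast component per coarse step with $y_2$ frozen at its previous value, and compares with the closed-form solution. The function name and all parameter values are hypothetical choices made for illustration.

```python
import math

def multirate_error(L, a=10.0, T=1.0, N=10):
    """Error in y1(T) for a multirate backward Euler sketch.

    Assumed model (illustrative only): y1' = -a*y1 + y2, y2' = -y2,
    y1(0) = y2(0) = 1. The fast component y1 takes L substeps per coarse
    step while the slow component y2 is frozen at its previous value.
    """
    dt = T / N          # coarse (slow) time step
    ds = dt / L         # fine (fast) time step
    y1, y2 = 1.0, 1.0
    for _ in range(N):
        y2_frozen = y2  # slow variable frozen during the fine steps
        for _ in range(L):
            # backward Euler for y1 with frozen coupling term
            y1 = (y1 + ds * y2_frozen) / (1.0 + a * ds)
        # one backward Euler step for the slow component
        y2 = y2 / (1.0 + dt)
    exact = math.exp(-a * T) + (math.exp(-T) - math.exp(-a * T)) / (a - 1.0)
    return y1 - exact

if __name__ == "__main__":
    for L in (1, 4, 16, 64, 256):
        print(L, multirate_error(L))
```

In this sketch, refining only the fine step reduces the error at first, after which the error stagnates at a level set by the coarse step and the frozen coupling.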

There is a significant literature on multirate numerical methods, see for example [48,29,1,45,30,55,56,50,2,37,8,3,33,14,19,18,44,15,38,32,5,51,36,40,41,46,11,49,57,42,60,13,58,61,54,52,53,34,10]. By and large, these references are focused on application and standard a priori analysis issues, e.g. stability, accuracy, and convergence properties. Such analysis is not generally useful for estimating the error in specific numerical solutions. The main goal of this paper is to derive a computable a posteriori error representation that accurately estimates the error in a specified quantity of interest computed from a multirate solution of (1).

Our analysis adapts a well developed approach based on duality, adjoint operators, and variational analysis, see for example [43,39,22,20,21,26,7,31,4,6]. In order to use a variational framework for analysis, we represent the numerical method as a finite element method. In this paper, we consider the so-called discontinuous Galerkin (dG) method [16,17,59,22], which employs piecewise polynomial shape functions. We can recover many standard finite difference schemes by appropriate choice of quadrature applied to the integrals defining the finite element solution. Our analysis also applies to the so-called continuous Galerkin (cG) method, which yields other families of difference schemes upon application of quadrature [24].

Our approach is based on the observation that a multirate method shares some features of multiscale operator decomposition methods for multiscale problems [43,9,27,28,25,23]. In particular, the need to project the approximate solutions between the discretizations at different scales and the practical use of incomplete iteration when solving the discrete equations have effects on accuracy and stability similar to those caused by operator decomposition. Indeed, we adapt ideas used in the investigation of operator splitting for reaction-diffusion equations [25] to carry out the analysis. In particular, we define adjoint operators for both the original nonlinear operator and the multirate-discretization operator independently by linearization with respect to a solution that is assumed to be common to both operators and obtain analogs of the standard Green's formula representing the quantity of interest relative to the common solution. We then carry out the a posteriori error analysis by comparing the resulting linear expressions. The fact that multiscale operator decomposition affects stability is directly reflected in the form of the estimate, which involves the difference between certain adjoint operators associated with the solution operator for the original problem and the multirate operator decomposition solution operator. A practical difficulty with such estimates is the fact that it is unlikely that the adjoint for the original problem is available since it poses the same multiscale challenge as solving the original forward problem. Therefore, we derive an estimate expressed as a computable leading order expression obtained using a posteriori arguments with a remainder that cannot be estimated but which is provably higher order. We call this a hybrid a priori - a posteriori estimate.

Fig. 2. Typical simulations of the two scale system. Left: A solution at a fixed time. Center: The components $y_1$ (slow) and $y_2$ (fast) versus time. Right: Plots of pointwise accuracy versus the fine scale time step for three coarse scale time steps (horizontal axis: number of fine time steps per large step).

Our analysis shares some aspects of the results in the penetrating series of papers [40-42] by Logg. In those papers, he carries out an a priori convergence analysis, an a posteriori error analysis, analyses the convergence of a fixed point scheme for the numerical method, and constructs an adaptive time stepping algorithm for what is termed multiadaptive cG and dG methods. These methods generalize the notion of multirate schemes to allow different time steps for each individual component of a system of differential equations of size N.

Our results differ from this work in two significant ways. First, Logg does not consider the effects of projecting between the discretizations at different scales on the accuracy of the approximation. If projections are not employed, then the quadrature used in the finite element formulation should be carried out on the fastest time scale in order to avoid integrating non-smooth discrete approximations. However, this greatly increases the expense of a multirate scheme. In practice, projections are introduced in a de facto fashion that is rarely arranged on the finest time scale, which potentially has a significant effect on accuracy. In any case, our analysis provides the means to quantify the effect of the disparate time scales in the discretizations on overall accuracy.

Secondly, Logg does not consider the effect of incomplete iteration in the solution of the discrete equations. That is, the complete convergence of the discrete system equations that must be solved at each "synchronization time" is assumed. In practice, it is common to carry out only a few or a fixed number of iterations (perhaps only one), and the fewer iterations that are used, the more the discretization scheme behaves like an operator decomposition method. Our analysis of the effect of incomplete iteration provides the means to strike a balance between the discretization and iteration errors.

The rest of the paper is organized as follows. In Section 2, we formulate multirate Galerkin finite element methods for (1). We present some a priori convergence results in Section 3. The main results on the a posteriori analysis of the multirate method are presented in Section 4 followed by several numerical examples in Section 5. In Section 6, we give the details of the proof of the a priori result. Finally, in Section 7, we present a conclusion.

2. A multirate Galerkin finite element method

2.1. A fully implicit multirate Galerkin finite element method

We discretize $[0,T]$ into $0 = t_0 < t_1 < t_2 < \cdots < t_N = T$ with time steps $\Delta t_n = t_n - t_{n-1}$, $\Delta t = \max_{1 \le n \le N}(\Delta t_n)$, and time intervals $I_n = (t_{n-1}, t_n)$. Let $L_{i,n}$, $i = 1, 2$ be two positive integers, where $L_{1,n}$ denotes the number of time steps used to solve the fast subsystem and $L_{2,n}$ the number of steps used for the slow subsystem. We denote time steps for each component in the Galerkin formulation by $\Delta s_{i,n} = \Delta t_n / L_{i,n}$, with $\Delta s_i = \max_{1 \le n \le N}(\Delta s_{i,n})$. Within $I_n$ we set

$I_{l,n} = (t_{l-1,n}, t_{l,n})$ with $t_{l,n} = t_{n-1} + l \Delta s_{1,n}$, for $l = 0, 1, 2, \ldots, L_{1,n}$, and

$I_{k,n} = (t_{k-1,n}, t_{k,n})$ with $t_{k,n} = t_{n-1} + k \Delta s_{2,n}$, for $k = 0, 1, 2, \ldots, L_{2,n}$.

We use an extension of the discontinuous Galerkin method [26], which is an appropriate method for dissipative problems. The analysis naturally extends to the continuous Galerkin method, which is more appropriate for problems with conserved quantities [26]. The finite element approximate solutions are sought in piecewise polynomial spaces,

$$\mathcal{V}^{(q_1)}(I_n) = \{U : U|_{I_{l,n}} \in \mathcal{P}^{(q_1)}(I_{l,n}),\ 1 \le l \le L_{1,n}\},$$

$$\mathcal{V}^{(q_2)}(I_n) = \{U : U|_{I_{k,n}} \in \mathcal{P}^{(q_2)}(I_{k,n}),\ 1 \le k \le L_{2,n}\},$$

where $\mathcal{P}^{(q)}(I)$ denotes the space of polynomials of degree at most $q$ on $I$.

The multirate Galerkin finite element method is to find $Y = (Y_1, Y_2)^{\top}$ with $Y_1|_{I_n} \in \mathcal{V}^{(q_1)}(I_n)$ and $Y_2|_{I_n} \in \mathcal{V}^{(q_2)}(I_n)$ satisfying

$$\int_{I_{l,n}} (\dot{Y}_1 - F_1(Y_1, Y_2), V)\, dt + ([Y_1]_{l-1,n}, V^{+}_{l-1,n}) = 0, \quad \forall V \in \mathcal{P}^{(q_1)}(I_{l,n}), \qquad (3)$$

for $l = 1, 2, \ldots, L_{1,n}$, and

$$\int_{I_{k,n}} (\dot{Y}_2 - F_2(\Pi Y_1, Y_2), W)\, dt + ([Y_2]_{k-1,n}, W^{+}_{k-1,n}) = 0, \quad \forall W \in \mathcal{P}^{(q_2)}(I_{k,n}), \qquad (4)$$

for $k = 1, 2, \ldots, L_{2,n}$. Here $\Pi$ is a projection operator between the two different meshes. In the simplest case, $\Pi = I$ implies that the integration in (4) is carried out on the finest mesh. As noted in [9], this projection can become a significant source of error whenever information is transferred between two different discretizations, and its effects should be estimated separately as a potentially significant component of the total error. Eqs. (3) and (4) together comprise a system of size $L_{1,n}(q_1 + 1) + L_{2,n}(q_2 + 1)$ which is solved implicitly.

We note that many standard finite difference schemes can be obtained by applying appropriate quadrature formulas to the integrals defining the finite element approximation. A straightforward modification of the analysis below extends the results to such difference schemes.

2.2. An iterative multirate Galerkin finite element method

Without loss of generality, we assume $L_{1,n} = d_n L_{2,n}$ for some positive integer $d_n$, i.e., $L_{1,n}$ is divisible by $L_{2,n}$. The iterative multirate Galerkin finite element method is to create a sequence $Y^{(m)} = (Y_1^{(m)}, Y_2^{(m)})^{\top}$ with $Y_1^{(m)}|_{I_n} \in \mathcal{V}^{(q_1)}(I_n)$ and $Y_2^{(m)}|_{I_n} \in \mathcal{V}^{(q_2)}(I_n)$ determined by Algorithm 1.

Algorithm 1: Iterative multirate Galerkin finite element method

for n = 1 to N do
    Set $Y^{(0)} = Y^{(M_{n-1})}(t_{n-1}^{-})$
    for m = 1 to $M_n$ do
        Set $Y^{(m)}(t_{n-1}^{-}) = Y^{(M_{n-1})}(t_{n-1}^{-})$.
        for l = 1 to $L_{1,n}$ do
            Compute $Y_1^{(m)}(t)$ for $t \in I_{l,n}$ satisfying

            $$\int_{I_{l,n}} (\dot{Y}_1^{(m)} - F_1(Y_1^{(m)}, Y_2^{(m-1)}), V)\, dt + ([Y_1^{(m)}]_{l-1,n}, V^{+}_{l-1,n}) = 0, \qquad (5)$$

            for all $V \in \mathcal{P}^{(q_1)}(I_{l,n})$.
        end for
        for k = 1 to $L_{2,n}$ do
            Compute $Y_2^{(m)}(t)$ for $t \in I_{k,n}$ satisfying

            $$\int_{I_{k,n}} (\dot{Y}_2^{(m)} - F_2(\Pi Y_1^{(m)}, Y_2^{(m)}), W)\, dt + ([Y_2^{(m)}]_{k-1,n}, W^{+}_{k-1,n}) = 0, \qquad (6)$$

            for all $W \in \mathcal{P}^{(q_2)}(I_{k,n})$.
        end for
    end for
end for
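The structure of Algorithm 1 can be sketched for the lowest order case. The code below is an illustrative sketch only, under several assumptions that are not taken from the paper: piecewise constant dG ($q_1 = q_2 = 0$, i.e. backward Euler), a linear test system $\dot y = A y$ with scalar components, $L_{2,n} = 1$ and $L_{1,n} = d$ on every coarse step, $\Pi$ equal to the identity, and a fixed number `M` of iterations. The function name is hypothetical.

```python
def iterative_multirate_be(A, y0, T, N, d, M):
    """Sketch of Algorithm 1 with dG(0) (backward Euler) for y' = A y.

    Assumptions (illustrative): 2x2 matrix A given as nested tuples,
    one slow step and d fast steps per coarse interval, Pi = identity,
    and M >= 1 fixed point iterations per coarse interval.
    """
    (a11, a12), (a21, a22) = A
    dt = T / N            # coarse step = slow step (L_2 = 1)
    ds = dt / d           # fast step (L_1 = d)
    Y1, Y2 = y0
    for n in range(N):
        Y2_lagged = Y2    # Y_2^{(0)} on I_n: previous end value
        for m in range(M):
            # fast sweep, eq. (5): Y_2 is lagged from the previous iterate
            y1 = Y1
            fine = []
            for l in range(d):
                y1 = (y1 + ds * a12 * Y2_lagged) / (1.0 - ds * a11)
                fine.append(y1)
            # slow step, eq. (6): integrate F_2 over the fine mesh (Pi = I)
            Y2_new = (Y2 + a21 * ds * sum(fine)) / (1.0 - dt * a22)
            Y2_lagged = Y2_new
        Y1, Y2 = fine[-1], Y2_new
    return Y1, Y2

if __name__ == "__main__":
    A = ((-20.0, 1.0), (1.0, -1.0))
    for M in (1, 2, 10):
        print(M, iterative_multirate_be(A, (1.0, 1.0), 1.0, 10, 8, M))
```

With `M = 1` the sketch behaves like an operator decomposition; as `M` grows the iterates approach the solution of the fully implicit system (3), (4) for this test problem.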
3. A priori theory

Under appropriate assumptions, the solution of (1) exists and is unique in $(0,T]$, see [12,47]. An a priori analysis for a multi-adaptive Galerkin solution of (3), (4) is provided in [42]. For completeness, we present a short a priori convergence analysis that is required for the hybrid a posteriori - a priori estimate for the iterative approximation produced by Algorithm 1.

3.1. An analytic iterative method

As a tool for analysis, we consider a theoretical approximation obtained by iterating analytic solutions of the fast and slow components. Specifically, we let $y^{(m)} = (y_1^{(m)}, y_2^{(m)})^{\top}$ denote the analytic approximation of (1) at iteration level $m$ obtained using the iterative procedure defined in Algorithm 2.

Algorithm 2: Analytic fixed point iteration

for n = 1 to N do
    Set $y^{(0)} = y^{(M_{n-1})}(t_{n-1})$
    for m = 1 to $M_n$ do
        Compute $y_1^{(m)}(t)$ for $t \in I_n$ satisfying

        $$\dot{y}_1^{(m)} = F_1(y_1^{(m)}, y_2^{(m-1)}) \qquad (7)$$

        Compute $y_2^{(m)}(t)$ for $t \in I_n$ satisfying

        $$\dot{y}_2^{(m)} = F_2(y_1^{(m)}, y_2^{(m)}) \qquad (8)$$

    end for
end for
We begin the a priori analysis by analyzing the convergence of the method in Algorithm 2 over $I_n = [t_{n-1}, t_n]$. The analysis is focused on determining a time interval $I_n$ over which the iteration converges to the correct solution. For this purpose, we assume that the solution from the previous time level $t_{n-1}$ has been obtained exactly, i.e., $y^{(m)}(t_{n-1}) = y(t_{n-1})$. Following the classic method of successive approximations, we integrate (7) and (8) to get

$$y_1^{(m)}(t) = y_1(t_{n-1}) + \int_{t_{n-1}}^{t} F_1(y_1^{(m)}, y_2^{(m-1)})\, ds,$$
$$y_2^{(m)}(t) = y_2(t_{n-1}) + \int_{t_{n-1}}^{t} F_2(y_1^{(m)}, y_2^{(m)})\, ds. \qquad (9)$$

Any continuous functions satisfying (9) also satisfy (7) and (8). Hence, the goal is to first show that each of the integral equations in (9) has a solution for a fixed $m$. We then proceed to show that the iteration solving (9) (and thus (7) and (8)) converges to the exact solution of (1). Details of this investigation are presented in Section 6. The main result is the following theorem.

Theorem 3.1. There exists a time $t_n > t_{n-1}$ such that the sequences of functions $\{y_1^{(m)}\}$ and $\{y_2^{(m)}\}$ produced by the integral Eq. (9) converge to the exact solution of (1) on the time interval $I_n = [t_{n-1}, t_n]$.

3.2. Convergence of the iterative multirate Galerkin finite element method

We now turn to the a priori error analysis of the numerical solution described in Algorithm 1. An unusual feature of this problem is that the numerical solution involves an alternating sequence of $Y_1^{(m)}$ and $Y_2^{(m)}$, which result from consistent numerical discretizations of (7) and (8) respectively. The analysis is carried out using the analog of the standard local error analysis for a finite difference scheme. For each component solution on each interval, we decompose the error into a sum of the error in the initial condition inherited from the previous component solution and the error of the numerical solution of the component assuming exact initial conditions on the current interval. We describe the main result below and give the detailed proof in Section 6.

Theorem 3.2. Let $y_1^{(m)}$ and $y_2^{(m)}$ be the solution of the analytic fixed point iteration governed by (7) and (8), respectively, and $Y_1^{(m)}$ and $Y_2^{(m)}$ be their dG finite element approximation with $\Pi =$ the identity. Then the finite element error at the final time for $q_1 = 0, 1$ and $q_2 = 0, 1$ is bounded, and

$$|e^{(M_N)}(t_N^{-})| \le O(\Delta s_1^{q_1+1}) + O(\Delta s_2^{q_2+1}) + O(|r_2|),$$

where $r_2 = \max_{1 \le n \le N} |r_2^{(M_n)}|$, and $r_2^{(m)}$ is the iteration residual written in Lemma 6.3.

Notice that there is a term quantifying the iteration residual, namely $r_2^{(m)}$. Provided the sequence of solutions is driven to produce a small iteration residual relative to the errors produced by the finite element solution, this term is negligible.

4. A posteriori analysis

It turns out that it is feasible to treat the effects of projecting between the discretizations at different scales and the use of finite iterations in the iterative solution of the discrete equations separately, and doing so greatly simplifies notation.

4.1. Analysis for the implicit multirate method

To define the adjoint, we set $z = sy + (1 - s)Y$, with $s \in (0,1)$, and then let $F'(z)$ be a matrix whose entries are

$$F'_{ij}(z) = \int_0^1 \frac{\partial F_i}{\partial y_j}(z)\, ds.$$

Consequently, $F(y) - F(Y) = F'(z)(y - Y)$. With $e = y - Y$, we note that

$$-\dot{e} + F'(z)e = (-\dot{y} + F(y)) + (\dot{Y} - F(Y)) = \dot{Y} - F(Y).$$

Associated with the finite element solution, we denote by $\vartheta$ the generalized Green's function that satisfies an adjoint problem

$$-\dot{\vartheta} = F'(z)^{\top}\vartheta, \quad t_n > t \ge t_{n-1}, \qquad \vartheta(t_n) = \psi_n. \qquad (10)$$

On time interval $I_{l,n}$, $l = 1, 2, \ldots, L_{1,n}$,

$$0 = \int_{I_{l,n}} (e, \dot{\vartheta} + F'(z)^{\top}\vartheta)\, dt = (e^{-}_{l,n}, \vartheta_{l,n}) - (e^{+}_{l-1,n}, \vartheta_{l-1,n}) + \int_{I_{l,n}} (-\dot{e} + F'(z)e, \vartheta)\, dt. \qquad (11)$$

Furthermore, using continuity of $y$,

$$e^{+}_{l-1,n} = e^{-}_{l-1,n} - [Y]_{l-1,n}.$$

Combining all these expressions yields the recursive relation

$$(e^{-}_{l,n}, \vartheta_{l,n}) = (e^{-}_{l-1,n}, \vartheta_{l-1,n}) - \int_{I_{l,n}} (\dot{Y} - F(Y), \vartheta)\, dt - ([Y]_{l-1,n}, \vartheta_{l-1,n}). \qquad (12)$$

Theorem 4.1. The implicit multirate Galerkin finite element solution satisfies the error equation over one time step $I_n$:

$$(e^{-}_{n}, \psi_n) = (e^{-}_{n-1}, \vartheta_{n-1}) + \mathcal{Q}_{1,n} + \mathcal{Q}_{2,n} + \mathcal{Q}_{\Pi,n},$$

where

$$\mathcal{Q}_{1,n} = -\sum_{l=1}^{L_{1,n}} \left\{ \int_{I_{l,n}} (\dot{Y}_1 - F_1(Y_1, Y_2), \vartheta_1)\, dt + ([Y_1]_{l-1,n}, \vartheta_{1,l-1,n}) \right\},$$

$$\mathcal{Q}_{2,n} = -\sum_{k=1}^{L_{2,n}} \left\{ \int_{I_{k,n}} (\dot{Y}_2 - F_2(\Pi Y_1, Y_2), \vartheta_2)\, dt + ([Y_2]_{k-1,n}, \vartheta_{2,k-1,n}) \right\},$$

$$\mathcal{Q}_{\Pi,n} = \sum_{k=1}^{L_{2,n}} \int_{I_{k,n}} (F_2(Y_1, Y_2) - F_2(\Pi Y_1, Y_2), \vartheta_2)\, dt.$$

Furthermore, by setting $\psi_N = \psi$ for a given $\psi$ and $\psi_{n-1} = \vartheta_{n-1}$ for $n = N, N-1, \ldots, 2$, the error of the implicit multirate Galerkin finite element method at the final time $t_N = T$ can be expressed as

$$(e^{-}_{N}, \psi) = \sum_{n=1}^{N} (\mathcal{Q}_{1,n} + \mathcal{Q}_{2,n} + \mathcal{Q}_{\Pi,n}). \qquad (13)$$

The term $\mathcal{Q}_{1,n}$ represents the finite element residual associated with the fast time scale subsystem, while $\mathcal{Q}_{2,n}$ represents the finite element residual associated with the slow time scale. The term $\mathcal{Q}_{\Pi,n}$ represents the effect of projection of $Y_1$ from the fast to the slow time scale. We use $\mathcal{Q}_{1,n}, \mathcal{Q}_{2,n}, \mathcal{Q}_{\Pi,n}$ to distinguish these terms from the closely related but distinct terms $Q_{1,n}, Q_{2,n}, Q_{\Pi,n}$ appearing in Theorems 4.2, 4.3, and 4.4 below.
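The residual-weighted-by-adjoint structure of (13) can be checked on a scalar problem. The following sketch is a simplified single-rate illustration under stated assumptions, not the multirate estimate itself: it takes $\dot y = \lambda y$, the piecewise constant dG method (backward Euler), and the exact adjoint $\vartheta(t) = \psi e^{\lambda (T - t)}$ of this linear problem, and evaluates the interval integrals with two-point Gauss quadrature. All names are hypothetical.

```python
import math

def backward_euler(lam, y0, T, N):
    """dG(0) / backward Euler nodal values Y_0, ..., Y_N for y' = lam*y."""
    dt = T / N
    Y = [y0]
    for _ in range(N):
        Y.append(Y[-1] / (1.0 - lam * dt))
    return Y

def adjoint(lam, psi, T, t):
    """Exact solution of -theta' = lam*theta, theta(T) = psi."""
    return psi * math.exp(lam * (T - t))

def error_estimate(lam, y0, T, N, psi=1.0):
    """Sum over steps of -(residual integral + jump term), as in (13)."""
    dt = T / N
    Y = backward_euler(lam, y0, T, N)
    g = 0.5 / math.sqrt(3.0)        # Gauss point offset in units of dt
    est = 0.0
    for n in range(1, N + 1):
        t0 = (n - 1) * dt
        mid = t0 + 0.5 * dt
        # On I_n the approximation is constant, so Ydot - F(Y) = -lam*Y_n.
        integral = 0.0
        for s in (-g, g):
            integral += 0.5 * dt * (-lam * Y[n]) * adjoint(lam, psi, T, mid + s * dt)
        jump = (Y[n] - Y[n - 1]) * adjoint(lam, psi, T, t0)
        est += -(integral + jump)
    return est, Y[-1]

if __name__ == "__main__":
    est, YN = error_estimate(-2.0, 1.0, 1.0, 20)
    print("estimate:", est, "true error:", math.exp(-2.0) - YN)
```

Because the problem is linear and the adjoint is exact here, the estimate reproduces the true error up to quadrature error; in the nonlinear multirate setting the adjoint must itself be approximated.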


4.2. Analysis for the iterative multirate method

To simplify notation, we now assume that the projection $\Pi =$ the identity. As discussed above, a key feature of the analysis is the realization that the analytic fixed point iteration is naturally associated with a different adjoint operator than the original problem. Our approach [25] to overcome this issue is to use a different linearization than commonly used for nonlinear problems. We assume that the operators for the original problem and the analytic fixed point iteration share a common solution, and use that as a linearization point. The simplest example of such a solution is a steady-state solution, which can be guaranteed to exist by assuming homogeneity in the right-hand side, i.e.,

$$F(0) = 0.$$

This is generally not restrictive in practice, but this assumption can be generalized (see [25]). We let

$$F'_{ij}(y) = \int_0^1 \frac{\partial F_i}{\partial y_j}(sy)\, ds, \quad i,j = 1,2, \qquad (14)$$

and $F'$ denotes the square matrix whose entries are (14). Then $F(y) = F'(y)y$ and so $\dot{y} = F'(y)y$. Associated with this linearized form, we denote by $\varphi$ the generalized Green's function satisfying the following adjoint problem:

$$-\dot{\varphi} = F'(y)^{\top}\varphi, \quad t \in (T, 0], \qquad \varphi(T) = \psi. \qquad (15)$$

On the subinterval $I_n = (t_{n-1}, t_n)$, we define the solution operators $\Phi_n$ associated with the Green's function,

$$\varphi(t) = \Phi_n(t)\psi_n,$$

for $t_n > t \ge t_{n-1}$ and some initial data $\psi_n$. We can obtain a solution representation using the Green's functions by multiplying $y$ with the adjoint Eq. (15), integrating in time, and integrating by parts,

$$(y_n, \psi_n) = (y_{n-1}, \Phi_n(t_{n-1})\psi_n). \qquad (16)$$

4.2.1. Analysis of the analytic fixed point iteration

To simplify presentation, we express the analytic fixed point iteration in Algorithm 2 in a more compact format. In particular, for any iteration index $m$, we write (7) and (8) as

$$\dot{y}^{(m)} = F(y^{(m)}) + d^{(m)}, \qquad (17)$$

where

$$d^{(m)} = -\left[ F_1(y_1^{(m)}, y_2^{(m)}) - F_1(y_1^{(m)}, y_2^{(m-1)}),\ 0 \right]^{\top}. \qquad (18)$$

The vector $d^{(m)}$ can be interpreted as a residual at the iteration level $m$.

To define an adjoint for the analytic fixed point iteration in Algorithm 2, we let $\varphi^{(k)} = (\varphi_1^{(k)}, \varphi_2^{(k)})^{\top}$ denote the generalized Green's function that satisfies an adjoint problem on time interval $I_n$ as given in Algorithm 3.

Algorithm 3: Adjoint for the analytic fixed point iteration

Set the initial iterate $\varphi_1^{(0)}$
for k = 1 to $K_n$ do
    Compute $\varphi_2^{(k)}$ satisfying

    $$-\dot{\varphi}_2^{(k)} = F'_{22}(y^{(m)})^{\top}\varphi_2^{(k)} + F'_{12}(y^{(m)})^{\top}\varphi_1^{(k-1)}, \quad t_n > t \ge t_{n-1}, \qquad \varphi_2^{(k)}(t_n) = \psi_{2,n}, \qquad (19)$$

    Compute $\varphi_1^{(k)}$ satisfying

    $$-\dot{\varphi}_1^{(k)} = F'_{11}(y^{(m)})^{\top}\varphi_1^{(k)} + F'_{21}(y^{(m)})^{\top}\varphi_2^{(k)}, \quad t_n > t \ge t_{n-1}, \qquad \varphi_1^{(k)}(t_n) = \psi_{1,n}, \qquad (20)$$

end for

Notice that the adjoint problems are solved backward in time and in the reverse order to that of the forward problem, starting with $\varphi_2^{(k)}$ followed by $\varphi_1^{(k)}$. These generalized Green's functions are an iterative approximation of (15). As in the forward problem, we can also rewrite this last algorithm in a compact form

$$-\dot{\varphi}^{(k)} = F'(y^{(m)})^{\top}\varphi^{(k)} + c^{(k)}, \qquad (21)$$

for adjoint iteration level $k$. Here

$$c^{(k)} = -\left[ 0,\ F'_{12}(y^{(m)})^{\top}(\varphi_1^{(k)} - \varphi_1^{(k-1)}) \right]^{\top}$$

is the residual of the adjoint at iteration level $k$. We also introduce the solution operators $\Phi_n^{(k)}$, with $\varphi^{(k)}(t) = \Phi_n^{(k)}(t)\psi_n$, for $t_n > t \ge t_{n-1}$. To get a representation of the iterative implicit solution, we follow a similar derivation as for the forward problem. Multiplying $y^{(m)}$ with (21), integrating each over $I_n$, and applying integration by parts, we obtain

$$(y_n^{(m)}, \psi_n) = (y_{n-1}^{(m)}, \Phi_n^{(k)}(t_{n-1})\psi_n) - \int_{I_n} (-\dot{y}^{(m)} + F'(y^{(m)})y^{(m)}, \varphi^{(k)})\, dt - \int_{I_n} (y^{(m)}, c^{(k)})\, dt.$$

Using (17), we obtain the solution representation of the analytic iterative implicit method

$$(y_n^{(m)}, \psi_n) = (y_{n-1}^{(m)}, \Phi_n^{(k)}(t_{n-1})\psi_n) + \int_{I_n} (d^{(m)}, \varphi^{(k)})\, dt - \int_{I_n} (y^{(m)}, c^{(k)})\, dt. \qquad (22)$$

We note that this representation is not in the standard format (in which the solution at the current time level solely depends on the previous time level values). It contains artifacts arising from the iterative procedure used to compute both forward and backward problems. The second term can be interpreted as the weighted average of the forward problem residual over a time step. The third term, on the other hand, is the weighted average of the backward problem residual over a time step. Thus, the iterative nature of the solution procedure is reflected in this representation. Once convergence is reached for both the forward and backward problems, the standard solution representation using the adjoint technique is recovered.

We are now able to express the error representation of the iterative implicit method. First, we state a lemma concerning an error equation over one time step.

Lemma 4.1. The analytic fixed point iteration satisfies the following error equation over one time step:

$$(y_n - y_n^{(m)}, \psi_n) = (y_{n-1}, \Delta\Phi_n^{(k)}(t_{n-1})\psi_n) - \int_{I_n} (d^{(m)}, \varphi^{(k)})\, dt + \int_{I_n} (y^{(m)}, c^{(k)})\, dt,$$

where $\Delta\Phi_n^{(k)} = \Phi_n - \Phi_n^{(k)}$.

Proof. This lemma is derived by subtracting (22) from (16) and setting $y_{n-1}^{(m)} = y_{n-1}$. □

Note that there are terms that are not computable in this expression. The term $\Delta\Phi_n^{(k)}$ is definitely not computable, though when convergence in the adjoint computation is reached, this term vanishes. Nevertheless, in the context of a finite number of iterations, we desire to quantify $\Delta\Phi_n^{(k)}$. This is made more precise below.

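4.2.2. Analysis of the iterative multirate Galerkin finite element method

To set up the adjoint, let $z^{(m)} = sy^{(m)} + (1 - s)Y^{(m)}$, with $s \in [0,1)$. Then let $F'(z^{(m)})$ be a matrix whose entries are

$$F'_{ij}(z^{(m)}) = \int_0^1 \frac{\partial F_i}{\partial y_j}(z^{(m)})\, ds.$$

Consequently, $F(y^{(m)}) - F(Y^{(m)}) = F'(z^{(m)})(y^{(m)} - Y^{(m)})$. Associated with the finite element solution, we denote by $\vartheta^{(k)} = (\vartheta_1^{(k)}, \vartheta_2^{(k)})^{\top}$ a sequence of generalized Green's functions that satisfies an adjoint problem

Algorithm 4: Adjoint for the iterative multirate Galerkin finite element method

Set the initial iterate $\vartheta_1^{(0)}$
for k = 1 to $K_n$ do
    Compute $\vartheta_2^{(k)}$ satisfying

    $$-\dot{\vartheta}_2^{(k)} = F'_{22}(z^{(m)})^{\top}\vartheta_2^{(k)} + F'_{12}(z^{(m)})^{\top}\vartheta_1^{(k-1)}, \quad t_n > t \ge t_{n-1}, \qquad \vartheta_2^{(k)}(t_n) = \psi_{2,n}. \qquad (23)$$

    Compute $\vartheta_1^{(k)}$ satisfying

    $$-\dot{\vartheta}_1^{(k)} = F'_{11}(z^{(m)})^{\top}\vartheta_1^{(k)} + F'_{21}(z^{(m)})^{\top}\vartheta_2^{(k)}, \quad t_n > t \ge t_{n-1}, \qquad \vartheta_1^{(k)}(t_n) = \psi_{1,n}. \qquad (24)$$

end for

As was the case in the adjoint formulation associated with the analytic fixed point iteration, this algorithm can be expressed in a compact form

$$-\dot{\vartheta}^{(k)} = F'(z^{(m)})^{\top}\vartheta^{(k)} + \eta^{(k)}, \qquad (25)$$

where

$$\eta^{(k)} = -\left[ 0,\ F'_{12}(z^{(m)})^{\top}(\vartheta_1^{(k)} - \vartheta_1^{(k-1)}) \right]^{\top}$$

is the residual of the adjoint at iteration level $k$.

At this stage, we are in a position to derive an error equation associated with the numerical discretization of the analytic fixed point iteration. Let $e^{(m)} = y^{(m)} - Y^{(m)}$. On time interval $I_{l,n}$, $l = 1, 2, \ldots, L_{1,n}$,

$$0 = \int_{I_{l,n}} (e^{(m)}, \dot{\vartheta}^{(k)} + F'(z^{(m)})^{\top}\vartheta^{(k)} + \eta^{(k)})\, dt = (e^{(m),-}_{l,n}, \vartheta^{(k)}_{l,n}) - (e^{(m),+}_{l-1,n}, \vartheta^{(k)}_{l-1,n}) + \int_{I_{l,n}} (-\dot{e}^{(m)} + F'(z^{(m)})e^{(m)}, \vartheta^{(k)})\, dt + \int_{I_{l,n}} (e^{(m)}, \eta^{(k)})\, dt. \qquad (26)$$

We note that

$$-\dot{e}^{(m)} + F'(z^{(m)})e^{(m)} = (-\dot{y}^{(m)} + F(y^{(m)})) + (\dot{Y}^{(m)} - F(Y^{(m)})) = -d^{(m)} + \dot{Y}^{(m)} - F(Y^{(m)}).$$

Furthermore, using continuity of $y^{(m)}$,

$$e^{(m),+}_{l-1,n} = e^{(m),-}_{l-1,n} - [Y^{(m)}]_{l-1,n}.$$

Inserting these expressions into (26) yields the recursive relation

$$(e^{(m),-}_{l,n}, \vartheta^{(k)}_{l,n}) = (e^{(m),-}_{l-1,n}, \vartheta^{(k)}_{l-1,n}) - \int_{I_{l,n}} (\dot{Y}^{(m)} - F(Y^{(m)}), \vartheta^{(k)})\, dt - ([Y^{(m)}]_{l-1,n}, \vartheta^{(k)}_{l-1,n}) + \int_{I_{l,n}} (d^{(m)}, \vartheta^{(k)})\, dt - \int_{I_{l,n}} (e^{(m)}, \eta^{(k)})\, dt. \qquad (27)$$

This is the basis for the equation for the error at time $t_n$ stated in the following lemma.

Lemma 4.2. The iterative multirate finite element method satisfies an error equation over one time step:

$$(e^{(m),-}_{n}, \psi_n) = (e^{(m),-}_{n-1}, \vartheta^{(k)}_{n-1}) + Q_{1,n} + Q_{2,n} + \sum_{l=1}^{L_{1,n}} \int_{I_{l,n}} (d^{(m)} - D^{(m)}, \vartheta^{(k)})\, dt - \sum_{l=1}^{L_{1,n}} \int_{I_{l,n}} (e^{(m)}, \eta^{(k)})\, dt,$$

where

$$Q_{1,n} = -\sum_{l=1}^{L_{1,n}} \left\{ \int_{I_{l,n}} (\dot{Y}_1^{(m)} - F_1(Y_1^{(m)}, Y_2^{(m-1)}), \vartheta_1^{(k)})\, dt + ([Y_1^{(m)}]_{l-1,n}, \vartheta^{(k)}_{1,l-1,n}) \right\},$$

$$Q_{2,n} = -\sum_{k'=1}^{L_{2,n}} \left\{ \int_{I_{k',n}} (\dot{Y}_2^{(m)} - F_2(Y_1^{(m)}, Y_2^{(m)}), \vartheta_2^{(k)})\, dt + ([Y_2^{(m)}]_{k'-1,n}, \vartheta^{(k)}_{2,k'-1,n}) \right\},$$

and $D^{(m)}$ is defined analogously to $d^{(m)}$.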
Proof. This is obtained by using the recursive relation (27). □

We note that this equation reflects the error arising from the consistent finite element numerical discretization of the analytic fixed point iteration. Similar to Lemma 4.1, this error contains the iteration residuals weighted by the adjoint $\vartheta^{(k)}$. The last term is not computable since it contains the error $e^{(m)}$ weighted by the iteration residual in the adjoint computation. Again, provided that an a priori estimate on $e^{(m)}$ is available, we can bound this term as higher order due to the fact that the residual can be made as small as needed when the adjoint computation is driven to convergence.

We now collect all the resulting estimates and obtain an error estimate of the finite element multiscale iterative implicit method by setting $y - Y^{(m)} = (y - y^{(m)}) + (y^{(m)} - Y^{(m)})$.

Theorem  4.2.  Set  i/'n  =  i/'  and  fot  n  =  N,  N-  1 . 2. 

Then  the  error  of  iterative  muitirate  Gaierkin  finite  eiement  method  at 
finai  time  tn  =  T  can  be  expressed  as 

~  1 '/')  =  +  0.3, n  +  0^,n  -f  Os.n  +  Q6,n)i 

n»l 

(28) 

0,1  „  and  Q2,„  are  given  in  Lemma  4.2  with  m  repiaced  by  M„  and  k  re¬ 
placed  by  K„,  and 

Q5,„  = 

Osp  =  (e<“"),»)(''"))df, 

Proof.  Denote  e<'"'='y-  V''"*.  First  we  need  to  get  the  total  error 
over  one  time  step.  We  setyj,™*,  =y„.,  in  Lemma  4.2  and  combine 
it  with  Lemma  4.1  to  get 

(eJMn)-  1^^)  _  ^ej,“",'  -F  Ql,n  +  Q2,n  +  Qs.n  +  Q4,n 

(29) 

1=1 

We  note  that  since  (see  Algorithm  1),  we  have 

ej,“",’"  =  eJ,“",-''~.  Furthermore,  by  adding  and  subtracting 
weget 

Combining all these expressions in (29) yields a recursive relation for the total error over one time step. The error at the final time is obtained by unrolling this relation. □

Theorem 4.2 has decomposed the total error at the final time into several components. The term $Q_{1,n}$ represents the finite element residual associated with the fast time scale subsystem, while $Q_{2,n}$ represents the finite element residual associated with the slow time scale. The term $Q_{3,n}$ represents the iteration error quantified by the iteration residual. It is expected that once convergence is reached this component should vanish. The term $Q_{4,n}$ also contains the iteration residual, so when convergence is reached, this component vanishes as well. Moreover, we note that in this term, the iteration residual is weighted by the difference between the adjoints $\vartheta$ and $\varphi$. Recall that the adjoints $\vartheta$ and $\varphi$ differ in the functions which are used for linearization. Thus, the term $Q_{4,n}$ also vanishes when $\vartheta^{(K_n)} = \varphi^{(K_n)}$, which may be true if, for example, the system (1) is linearly coupled, i.e., if $F_i(y_1, y_2) = A_{i1}y_1 + A_{i2}y_2$ for some matrices $A_{i1}$ and $A_{i2}$.

The terms $Q_{5,n}$ and $Q_{6,n}$ contain $\Delta\Phi_n^{(K_n)}\psi_n$, which is not computable. As has been mentioned, provided an a priori estimate regarding the error of $y$ and $Y$ is available, $Q_{6,n}$ is of higher order in the asymptotic limit. All these issues are addressed in the next section.

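4.2.3. A computable error estimate

The following lemma shows that if the analytic fixed point iteration has small residual, then $\Delta\Phi_n^{(K_n)}\psi_n$ can be written as a sum of the residuals of the adjoint iterations and some higher order terms.

Lemma 4.3. If $F'$ is Lipschitz continuous and $y^{(M_n)}$ is sufficiently close to $y$ in $I_n$,

$$\Delta\Phi_n^{(K_n)}(t_{n-1})\psi_n = -\int_{I_n} c^{(K_n)}\, dt + \text{h.o.t.}$$

Proof. We denote by $\tilde{\varphi}$ the function that satisfies

$$-\dot{\tilde{\varphi}} = F'(y^{(M_n)})^{\top}\tilde{\varphi}, \quad t \in (t_n, t_{n-1}), \qquad \tilde{\varphi}(t_n) = \psi_n.$$

Using this equation we write $\varphi - \varphi^{(K_n)} = (\varphi - \tilde{\varphi}) + (\tilde{\varphi} - \varphi^{(K_n)}) = A + B$, where $A$ satisfies

$$-\dot{A} = F'(y)^{\top}A + \left[ F'(y)^{\top} - F'(y^{(M_n)})^{\top} \right]\tilde{\varphi}, \quad t \in (t_n, t_{n-1}), \qquad A(t_n) = 0.$$

Using a standard technique for systems of ordinary differential equations, we get

$$A(t) = \int_t^{t_n} \left[ F'(y)^{\top} - F'(y^{(M_n)})^{\top} \right]\tilde{\varphi}\, d\tau + \text{higher order terms}.$$

If $F'$ is Lipschitz continuous,

$$|A(t)| \le C \Delta t_n\, \|y - y^{(M_n)}\|\, \|\tilde{\varphi}\|,$$

and thus, when $y^{(M_n)}$ is sufficiently close to $y$ in $I_n$, $A(t)$ is of higher order. Using (21), $B$ satisfies

$$-\dot{B} = F'(y^{(M_n)})^{\top}B - c^{(K_n)}, \quad t \in (t_n, t_{n-1}), \qquad B(t_n) = 0,$$

with solution expressed in similar fashion as $A(t)$,

$$B(t) = -\int_t^{t_n} c^{(K_n)}\, d\tau + O(|c^{(K_n)}| \Delta t_n^2).$$

Computing $B(t_{n-1})$ completes the proof. □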

Once this is in place, we may verify that $\sum_{n=1}^{N} Q_{5,n}$ and $\sum_{n=1}^{N} Q_{6,n}$ are of higher order. These are stated in the following lemma.

Lemma 4.4. When $c^{(K_n)}$ and $\eta^{(K_n)}$ are sufficiently small, the terms $\sum_{n=1}^{N} Q_{5,n}$ and $\sum_{n=1}^{N} Q_{6,n}$ are of higher order.

Proof. Using Lemma 4.3 we get

$$Q_{5,n} = \int_{I_n} (y^{(M_n)} - y^{(M_n)}_{n-1}, c^{(K_n)})\, dt + \text{h.o.t.}$$

Since $|y^{(M_n)} - y^{(M_n)}_{n-1}| \le C \Delta t_n$, $Q_{5,n} \approx O(|c^{(K_n)}| \Delta t_n^2)$, which for sufficiently small $c^{(K_n)}$ makes $\sum_{n=1}^{N} Q_{5,n}$ a higher order term. Similarly, for sufficiently small $\eta^{(K_n)}$, $\sum_{n=1}^{N} Q_{6,n}$ is also a higher order term, because both these residuals are weighted by the numerical solution errors. □

Based  on  Theorem  4.2  and  incorporating  the  two  lemmas  above, 
we  may  now  write  a  computable  error  estimator  for  the  iterative 
multirate  Galerkin  finite  element  method. 

Theorem 4.3. The computable error of the iterative multirate Galerkin finite element method at the final time $t_N = T$ is

“Q1+Q2  +  Q3  +  Q4 

=  5Z(Ql,n  +  Q2,„  +  Q3,„ -I- Q4,n)i  (31) 

nsl 

where 

=  -£  (n“"'  } 

Q3,n=E/  (■if-’.-J'*”’)''' 

1-1  •''l.  '  ' 

■'hi,  '  ' 

Remark  4.1.  Notice  that  Q4J,  contains  (5*,“"’  which  is  an  expression 
that  is  dependent  ony<""),  the  analytic  fixed  point  iteration  solu¬ 
tion  of  (1 ).  By  adding  and  subtracting  4^"’  to  Q4„  we  get 

Q4,n  =  E/ 

+  ((5f"> -4“"’. (32) 

The second term in the last equation is higher order because it involves the difference between two residuals.

4.3.  A  computable  error  estimate  including  projection 

Relaxing our assumption in Section 4.2 and allowing for a projection other than the identity, we may write a computable error estimate for the iterative multirate Galerkin finite element method including projection.

Theorem 4.4. The computable error of the iterative multirate Galerkin finite element method including projection at final time $t_N = T$ is

$$(y_N - Y_N^{-}, \psi_N) \approx Q_1 + Q_2 + Q_3 + Q_4 + Q_\Pi = \sum_{n=1}^{N} (Q_{1,n} + Q_{2,n} + Q_{3,n} + Q_{4,n} + Q_{\Pi,n}), \qquad (33)$$

where 

= -  E  {/  ('^r’  -  F.  + (K"’],. ’ 

Q33,=E  /  (4”"’.'’'""’)* 

/.|  Jhj,  ' 

f.l  'hj, 

=  E  (yf -i.yf ■))  -F2 {nYr,Yr) ,i?«)  dr. 


5.  Numerical  experiments 

In this section, we present several numerical examples that show the performance of the error estimates. All forward problems are solved using the lowest order, piecewise constant dG method, which is equivalent to the backward Euler scheme. In particular, for iteration level $m$, the scheme is


'It  -  EF2(nY',t.nt). 


U=  1,2,.. .,1.2,4,  Ij  =  (k  -  ■l)d„ +j. 

When  nonlinear,  the  individual  component  equations  are  solved 
using  Newton's  Method.  The  adjoint  solutions  are  computed  using 
a  second  order,  piecewise  linear,  continuous  Galerkin  method, 
which  is  equivalent  to  the  second  order  Crank-Nicolson  scheme. 
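As an illustration of the forward solver described above, the following is a minimal sketch of backward Euler with a Newton solve at each step for a single scalar equation. It is not the authors' implementation: the function name `backward_euler_newton`, its parameters, and the test problem $\dot{y} = -y^2$ are assumptions chosen only to make the idea concrete.

```python
def backward_euler_newton(f, dfdy, y0, t0, t1, n_steps, tol=1e-12, max_newton=50):
    """Integrate y' = f(t, y) from t0 to t1 with backward Euler.

    Each implicit step solves g(Y) = Y - y_old - h*f(t_new, Y) = 0 by
    Newton's method. All names here are illustrative assumptions.
    """
    h = (t1 - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        t_new = t + h
        y_new = y  # initial Newton guess: value from the previous step
        for _ in range(max_newton):
            g = y_new - y - h * f(t_new, y_new)
            dg = 1.0 - h * dfdy(t_new, y_new)
            step = g / dg
            y_new -= step
            if abs(step) < tol:
                break
        t, y = t_new, y_new
    return y


# Hypothetical test problem: y' = -y^2, y(0) = 1, exact solution 1/(1 + t).
y_end = backward_euler_newton(lambda t, y: -y * y, lambda t, y: -2.0 * y,
                              1.0, 0.0, 1.0, 1000)
```

Because the scheme is first order, halving the step size should roughly halve the error at the final time.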

In order to illuminate the behavior of the error, we take the quantity of interest to be the individual error in each component at the final time. We point out that the choice of the quantity of interest has a significant impact on the behavior of the error in general [26,23]. This is even more significant in a multiscale problem.

We demonstrate the robustness of the proposed error estimator through several examples below. These examples also show the potential for using an accurate estimate to adaptively determine the parameters controlling accuracy. Since the error estimate is written as a sum of contributing components, we can determine the largest source of error and adjust the corresponding parameter.
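The selection step described above can be sketched as follows. This is only an illustration: the dictionary keys, the function names, and the mapping from component to action are assumptions (the mapping follows the roles the text assigns to the fast scale residual, slow scale residual, iteration error, and projection error). The sample numbers are the contributions reported for $z(t)$ in Table 2.

```python
# Hypothetical mapping from error component to the parameter it suggests adjusting.
SUGGESTED_ACTION = {
    "Q1": "reduce the fast time step",
    "Q2": "reduce the slow time step",
    "Q3": "perform more iterations",
    "QPi": "use a more accurate projection",
}


def dominant_component(contributions):
    """Return (name, value, total) for the contribution largest in magnitude."""
    name = max(contributions, key=lambda k: abs(contributions[k]))
    total = sum(contributions.values())
    return name, contributions[name], total


def suggest(contributions):
    """Name the dominant component and the (assumed) corresponding action."""
    name, _, _ = dominant_component(contributions)
    return name, SUGGESTED_ACTION.get(name, "no rule defined")


# Contributions to the estimated error in z(t) from Table 2.
table2_z = {"Q1": -43.66, "Q2": 10.79, "QPi": -13.94}
choice = suggest(table2_z)
```

Note that contributions can have opposite signs and partially cancel, so the dominant component is chosen by magnitude rather than by signed value.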

In the first example in Section 5.1, we illustrate the consequences of projections between scales. The rest of the examples illustrate the consequences of incomplete iterations, and in those examples we assume $\Pi$ = the identity.

5.1. Numerical example illustrating discretization and projection error

To illustrate the performance of the error estimator provided by Theorem 4.1 (fully implicit multirate Galerkin finite element method), we consider the numerical solution of a $3 \times 3$ system

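$$
\begin{aligned}
\dot{x} &= 100y + z, & x(0) &= \tfrac{9001}{10001},\\
\dot{y} &= -100x, & y(0) &= -\tfrac{100000}{10001},\\
\dot{z} &= -\tfrac{z}{10001^2}\left((10001x + z)^2 + (10001y + 100z)^2\right), & z(0) &= 1000,
\end{aligned} \qquad (34)
$$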


which has fast and slow equations coupled nonlinearly. In particular, the equation determining $z(t)$ contains nonlinear coupling to the fast scale components $x(t)$ and $y(t)$. The true solution is $x(t) = \cos(100t) - \frac{1000}{10001}e^{-t}$, $y(t) = -\sin(100t) - \frac{100000}{10001}e^{-t}$, and $z(t) = 1000e^{-t}$. There are two distinct time scales, fast $O(2\pi/100)$ and slow $O(1)$. We set $y_1 = [x\ y]^\top$ (associated with the fast time scale) and $y_2 = z$ (associated with the slow time scale). A typical solution is depicted in Fig. 3.
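A quick consistency check of the stated solution can be sketched as below. It assumes the system reads $\dot{x} = 100y + z$, $\dot{y} = -100x$, $\dot{z} = -\frac{z}{10001^2}\left((10001x+z)^2 + (10001y+100z)^2\right)$ with $x(t) = \cos(100t) - \frac{1000}{10001}e^{-t}$, $y(t) = -\sin(100t) - \frac{100000}{10001}e^{-t}$, $z(t) = 1000e^{-t}$; the function names are illustrative and not from the original.

```python
import math

A = 1000.0 / 10001.0    # coefficient of exp(-t) in x(t)
B = 100000.0 / 10001.0  # coefficient of exp(-t) in y(t)


def exact(t):
    """Assumed true solution (x, y, z) at time t."""
    e = math.exp(-t)
    return (math.cos(100 * t) - A * e, -math.sin(100 * t) - B * e, 1000.0 * e)


def exact_dot(t):
    """Analytic time derivative of the assumed true solution."""
    e = math.exp(-t)
    return (-100 * math.sin(100 * t) + A * e,
            -100 * math.cos(100 * t) + B * e,
            -1000.0 * e)


def rhs(x, y, z):
    """Right-hand side of the assumed form of the system."""
    slow = -z / 10001.0 ** 2 * ((10001 * x + z) ** 2 + (10001 * y + 100 * z) ** 2)
    return (100 * y + z, -100 * x, slow)


def max_residual(times):
    """Largest componentwise mismatch between rhs(exact) and exact_dot."""
    worst = 0.0
    for t in times:
        f, d = rhs(*exact(t)), exact_dot(t)
        worst = max(worst, max(abs(a - b) for a, b in zip(f, d)))
    return worst


residual = max_residual([0.0, 0.1, 0.25, 0.5])
```

The residual stays at rounding level because $10001x + z = 10001\cos(100t)$ and $10001y + 100z = -10001\sin(100t)$ along the solution, so the slow equation reduces to $\dot{z} = -z$.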

The multirate finite element solution is constructed on the piecewise constant finite element space, i.e., with $q_1 = q_2 = 0$. The system is solved until $T = 0.5$. We use $\Delta t = 0.5/10$, $\Delta s_1 = \Delta t/800$, and $\Delta s_2 = \Delta t/10$ and examine three different projections: (i) $\Pi_1$: the identity operator, (ii) $\Pi_2$: averaging over $(t_{k-1,n}, t_{k,n})$ (i.e., over a subinterval of length $\Delta s_2$), and (iii) $\Pi_3$: averaging over $(t_{n-1}, t_n)$. Fig. 4 compares the exact errors of the multirate solutions when solved employing these three different projections. As expected, the multirate solutions exhibit the best performance when the identity operator is used. In all subsequent examples we shall assume that the projection is the identity operator and thus $Q_\Pi = 0$.

Fig. 3. Typical solution of (34).


Fig. 4. Comparison of exact errors for multirate solution of (34) with various projections. $\Pi_1$: identity (solid lines), $\Pi_2$: local average over $I_{k,n}$ (dotted lines), and $\Pi_3$: average over $I_n$ (dashed lines).


Table 1. Performance of error estimator at T = 0.5 for $\Pi_1$, the identity operator.

                   Error for x(t)   Error for y(t)   Error for z(t)
Exact error             0.124            0.569           -48.92
Error estimate          0.126            0.535           -46.11
Q_1                     0.123            0.500           -43.63
Q_2                     0.003            0.035            -2.48
Q_Π                     0                0                 0

Table 2. Performance of error estimator at T = 0.5 for $\Pi_2$, averaging over $I_{k,n}$.

                   Error for x(t)   Error for y(t)   Error for z(t)
Exact error             0.126            0.665           -57.31
Error estimate          0.128            0.541           -46.81
Q_1                     0.123            0.499           -43.66
Q_2                    -0.014           -0.144            10.79
Q_Π                     0.019            0.186           -13.94

Tables 1-3 show the performance of the error estimator when solving the system employing the three different projections. As expected, the estimator performs reasonably well when $\Pi_1$ and $\Pi_2$, the identity and local averaging operators respectively, are used and it breaks down when $\Pi_3$, the averaging operator over $I_n$, is used. Nevertheless, the estimator still gives a hint of what is actually happening in terms of the relative size of the projection error $Q_\Pi$ compared to the total estimated error for all three components.
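To make the comparison across the three projections concrete, the sketch below computes the ratio of estimated to exact error (often called an effectivity ratio) from the values in Tables 1-3, and checks that the listed components add up to the reported estimate. The data layout and function names are assumptions made for this illustration.

```python
# Values transcribed from Tables 1-3; each tuple is ordered (x, y, z).
# "parts" lists the rows Q_1, Q_2, Q_Pi in that order.
TABLES = {
    "Pi_1": {"exact": (0.124, 0.569, -48.92),
             "estimate": (0.126, 0.535, -46.11),
             "parts": [(0.123, 0.500, -43.63), (0.003, 0.035, -2.48), (0.0, 0.0, 0.0)]},
    "Pi_2": {"exact": (0.126, 0.665, -57.31),
             "estimate": (0.128, 0.541, -46.81),
             "parts": [(0.123, 0.499, -43.66), (-0.014, -0.144, 10.79), (0.019, 0.186, -13.94)]},
    "Pi_3": {"exact": (0.161, 3.775, -370.03),
             "estimate": (0.194, 0.970, -87.64),
             "parts": [(0.111, 0.540, -48.63), (0.004, 0.018, -1.64), (0.079, 0.412, -37.37)]},
}


def effectivity(table):
    """Ratio estimate/exact for each of the components x, y, z."""
    return tuple(est / ex for est, ex in zip(table["estimate"], table["exact"]))


def component_sum(table):
    """Sum of the listed contributions for each of x, y, z."""
    return tuple(sum(column) for column in zip(*table["parts"]))


ratios = {name: effectivity(tab) for name, tab in TABLES.items()}
```

A ratio close to 1 indicates an accurate estimate. For $\Pi_3$ the ratios for $y(t)$ and $z(t)$ fall to roughly a quarter, which matches the breakdown described in the text.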

5.2. A one-way system with the fast variables coupled into the slow equation

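We consider the $3 \times 3$ system

$$
\begin{aligned}
\dot{x} &= -50y, & x(0) &= 1,\\
\dot{y} &= 50x, & y(0) &= 0,\\
\dot{z} &= -z + x + y, & z(0) &= 2,
\end{aligned} \qquad (35)
$$

in which the fast variables are coupled into the equation for the slow subsystem. The true solution is $x(t) = \cos(50t)$, $y(t) = \sin(50t)$, and $z(t) = \frac{5051}{2501}e^{-t} - \frac{49}{2501}\cos(50t) + \frac{51}{2501}\sin(50t)$. We set $y_1 = [x\ y]^\top$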

Table 3. Performance of error estimator at T = 0.5 for $\Pi_3$, averaging over $I_n$.

                   Error for x(t)   Error for y(t)   Error for z(t)
Exact error             0.161            3.775          -370.03
Error estimate          0.194            0.970           -87.64
Q_1                     0.111            0.540           -48.63
Q_2                     0.004            0.018            -1.64
Q_Π                     0.079            0.412           -37.37

(associated with the fast time scale) and $y_2 = z$ (associated with the slow time scale). Since the coupling is one way, there is no iteration needed when solving the system, i.e. we solve for $y_1$ and use the solution to solve for $y_2$. The same holds for the associated adjoint computation. Thus, the error arises solely from the numerical solution of the fast and slow subsystems. Note however that the numerical error of the fast component affects the accuracy of the slow component. We plot the typical behavior of the error in Fig. 5. The accuracy of the method deteriorates for longer times; however, the estimator can accurately predict the error dynamics.
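The one-way solve described above can be sketched as follows for the system $\dot{x} = -50y$, $\dot{y} = 50x$, $\dot{z} = -z + x + y$ with $x(0) = 1$, $y(0) = 0$, $z(0) = 2$. This is a simplified illustration, not the authors' code: it assumes backward Euler on each scale, and it assumes the fast solution enters the slow equation through the exact integral of its piecewise constant values over each slow step. The function name and arguments are hypothetical.

```python
import math


def solve_one_way(T=1.0, dt=0.05, fast_per_dt=128, slow_per_dt=1):
    """Backward Euler on two time scales for the one-way coupled system.

    fast_per_dt and slow_per_dt are the numbers of fast and slow steps
    per coarse step dt (assumed names).
    """
    if fast_per_dt % slow_per_dt != 0:
        raise ValueError("fast_per_dt must be a multiple of slow_per_dt")
    n_coarse = round(T / dt)
    hf = dt / fast_per_dt
    hs = dt / slow_per_dt
    fast_per_slow = fast_per_dt // slow_per_dt
    a = 50.0 * hf
    det = 1.0 + a * a
    x, y, z = 1.0, 0.0, 2.0
    for _ in range(n_coarse):
        for _ in range(slow_per_dt):
            forcing = 0.0
            for _ in range(fast_per_slow):
                # Fast step: solve x_new + a*y_new = x, y_new - a*x_new = y.
                x, y = (x - a * y) / det, (a * x + y) / det
                forcing += hf * (x + y)  # integral of piecewise constant x + y
            # Slow step: z_new = z + hs*(-z_new) + integral of (x + y).
            z = (z + forcing) / (1.0 + hs)
    return x, y, z


x_T, y_T, z_T = solve_one_way()
```

With these assumptions each fast step damps the amplitude $\sqrt{x^2 + y^2}$ by the factor $(1 + (50\,\Delta s_1)^2)^{-1/2}$, which is one way to see why the accuracy deteriorates over longer times even though no iteration error is present.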
Fig. 5. Time history of the error for solving (35). Left: $x(t)$.
Fig. 6. Error for component $z(t)$ in (35) at $T = 1$ plotted against $\Delta s_2$.

Only the terms $Q_1$ and $Q_2$ contribute to the error estimate (31), and in fact, the error is dominated by the finite element residual from the fast scale $Q_1$. Fig. 6 shows the error in component $z(t)$ at the final time $T = 1$ when the system is solved using $\Delta t = 0.05$, $\Delta s_1 = \Delta t/128$, and decreasing $\Delta s_2$. The slow scale finite element residual $Q_2$ decreases linearly as $\Delta s_2$ decreases. On the other hand, the fast scale finite element residual $Q_1$ does not exhibit significant change. Decreasing $\Delta s_2$ therefore yields improved accuracy only up to a certain stage, after which the error is dominated by the fast scale residual. In terms of adaptivity, this example emphasizes the potential of the error estimator to provide criteria for time step refinement specific to the dominant error component.
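A statement such as "decreases linearly as the step decreases" can be checked from tabulated values by computing observed orders between successive refinements. The helper below is a generic sketch; the function name is an assumption and the sample numbers are synthetic, not data from the paper.

```python
import math


def observed_orders(steps, errors):
    """Observed convergence order between each pair of successive step sizes.

    For error ~ C * h**p, the order is p = log(e0/e1) / log(h0/h1).
    """
    orders = []
    for i in range(len(steps) - 1):
        ratio_e = abs(errors[i]) / abs(errors[i + 1])
        ratio_h = steps[i] / steps[i + 1]
        orders.append(math.log(ratio_e) / math.log(ratio_h))
    return orders


# Synthetic illustration: a contribution that halves whenever the step halves.
steps = [0.1, 0.05, 0.025]
linear_part = [0.2, 0.1, 0.05]
orders = observed_orders(steps, linear_part)
```

Applied to a total error that levels off, the observed order drops toward zero, which is the signature of a different component having become dominant.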

5.3.  A  system  with  a  slow  variable  coupled  into  the  fast  equations 

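Next, we consider the $3 \times 3$ system

$$
\begin{aligned}
\dot{x} &= 100y + z, & x(0) &= \tfrac{9001}{10001},\\
\dot{y} &= -100x, & y(0) &= -\tfrac{100000}{10001},\\
\dot{z} &= -z, & z(0) &= 1000,
\end{aligned} \qquad (36)
$$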


Fig. 5 (continued). Right: $z(t)$. The time steps are $\Delta t = 0.05$, $\Delta s_1 = \Delta t/128$, $\Delta s_2 = \Delta t$.


in which the slow variable enters into the fast equations. In fact, this system has the same solution as (34) in Section 5.1. As in that subsection, we set $y_1 = [x\ y]^\top$ and $y_2 = z$. Because the slow scale equation does not involve the fast scale variables, two iterations are
sufficient  to  reduce  the  iteration  error.  Also,  the  slow  scale  finite 
element  residual  component  of  the  error  Q,  is  zero  for  component 
$y_2$. For all components, the iteration residual component $Q_4$ is zero because the adjoints $\varphi$ and $\tilde{\varphi}$ are equal, which is a consequence of the fact that $F'$ is a constant independent of the solution.

Fig. 7 shows the error components in the fast scale components $x(t)$ and $y(t)$ as a function of the fast time step $\Delta s_1$. Here (36) is solved until $T = 2$ with $\Delta t = 0.2$ and $\Delta s_2 = \Delta t/20$. The method uses only one iteration in each of the coarse time steps $\Delta t$. The slow scale component $z(t)$ has been solved accurately with error about 0.5%. Moreover, the difference between estimated and exact errors is about 0.08%. As shown in the figure, the error estimator gives an accurate prediction despite the inaccuracy of the method. Each component exhibits a different error behavior in terms of the dominant component. For component $x(t)$, the dominant component is $Q_3$, i.e. the iteration error, so decreasing $\Delta s_1$ does not improve the method's accuracy, although the fast scale finite element residual $Q_1$ does decrease linearly with respect to $\Delta s_1$. By contrast, the error in component $y(t)$ is dominated by $Q_1$, and thus decreasing $\Delta s_1$ results in a smaller $Q_1$ and hence reduces the error for this component. Moreover, when $\Delta s_1$ is sufficiently small, the contributions to the error from all components become comparable.

Fig. 8 shows the error components in the fast scale components $x(t)$ and $y(t)$ when two iterations are used to solve the system. In this case, the iteration error component $Q_3$ is essentially zero and the dominant component is $Q_1$ for both $x(t)$ and $y(t)$. As $\Delta s_1$ is decreased, this term decreases as does the total error. Both components exhibit similar behavior and the error estimator predicts the exact error accurately.




Fig. 7. Error for components $x(t)$ and $y(t)$ in (36) at $T = 2$. Solutions are obtained using one iteration.
Fig. 8. Error for components $x(t)$ and $y(t)$ in (36) at $T = 2$. Solutions are obtained using two iterations.
Fig. 9. Error for components $y_1(t)$ and $y_2(t)$ at $T = 1$ in (37) as a function of number of iterations.
Fig. 10. Time history of error in $x$ for solving (34). Left: $\Delta t = 0.05$, $\Delta s_1 = \Delta t/1600$, $\Delta s_2 = \Delta t/32$. Right: $\Delta t = 0.00625$, $\Delta s_1 = \Delta t/200$, $\Delta s_2 = \Delta t/4$.


5.4.  A  nonlinearly  coupled  system  with  one  scale 
We consider

$$
\begin{aligned}
\dot{y}_1 &= e^{y_1} + e^{y_2} - 2, & y_1(0) &= -1,\\
\dot{y}_2 &= -e^{y_1} - e^{y_2} + 2, & y_2(0) &= 1.
\end{aligned} \qquad (37)
$$

The exact solution is $y_1(t) = \ln((e-1)t + 1) - \ln((e-1)t + e)$, and $y_2(t) = -y_1(t)$. The system is not multiscale; however, the two equations are coupled in a nonlinear fashion and we can investigate the behavior of the error as the iterations increase. Fig. 9 shows the behavior of the error over $[0, 1]$ with $\Delta t = 1$ and $\Delta s_1 = \Delta s_2 = \Delta t/4$. At the first iteration, all error components are comparable to each other. As the iterations increase, the components $Q_3$ and $Q_4$ are reduced. However, the overall error fails to continue to improve significantly as the iterations increase because the error is eventually dominated by the finite element residuals $Q_1$ and $Q_2$. We can also see that in each iteration the error estimator is in good agreement with the exact error.

5.5.  A  nonlinearly  coupled  multiscale  system 

Next, we reconsider the $3 \times 3$ system in (34) described in Section 5.1. Figs. 10-12 show the time history of the error for the three components. The system is solved until $T = 2$. The plots on the left are for $\Delta t = 0.05$ and on the right for $\Delta t = 0.00625$. We maintain the absolute size of the other time steps, resulting in $\Delta s_1 = 0.05/1600$ for the left column plots, and $\Delta s_1 = 0.00625/200$ for the right column plots. The errors in all plots are shown for the solutions obtained after convergence is reached. It requires 5 iterations to reach convergence for the solution with $\Delta t = 0.05$ (left-hand plots), and only 2 iterations for the solution with $\Delta t = 0.00625$ (right-hand plots). The dominant error is always the fast scale finite element residual $Q_1$. The estimator predicts the error with $\approx 2\%$ difference for $x(t)$, and $\approx 11\%$ for $y(t)$ and $z(t)$.

Fig. 11. Time history of error in $y$ for solving (34). Left: $\Delta t = 0.05$, $\Delta s_1 = \Delta t/1600$, $\Delta s_2 = \Delta t/32$. Right: $\Delta t = 0.00625$, $\Delta s_1 = \Delta t/200$, $\Delta s_2 = \Delta t/4$.

5.6.  A  two-scale  system  of  wire  suspended  masses 

We  return  to  the  two-scale  version  of  the  mass-wire  system  (2) 
described  in  the  introduction.  We  set  M=10,  m  =  0.1,  /1  =  0.25. 
a  =  0.1,  and  r  =  y  =0. 

Figs. 13-16 show the time history of the error of three components of the solution: the slow component/heavy mass at location $x_1$, the light mass at $x_3$ (the so-called "bridge" mass) that is connected to a heavy mass on one side and a light mass on the other, and the fast component/light mass at location $x_7$. Even when the error is very large, the error estimate gives an accurate picture of the error. The figures also indicate the dominant component in each case. For example, when only one iteration is used, the iteration error is dominant, while when three iterations are used, the finite element residual is the dominating component.

6.  Details  of  the  a  priori  analysis 

6.1.  Convergence  of  the  analytic  fixed  point  iteration 

As with the standard local analysis for ordinary differential equations, the solution of (9) is sought in a neighborhood of the initial condition $y(t_{n-1})$. Because $F \in C^1(\mathcal{E})$, it is locally Lipschitz in $\mathcal{E}$. In particular, for $y(t_{n-1}) \in \mathcal{E}$ there exists an $\epsilon$-neighborhood $B_\epsilon(y(t_{n-1}))$ in $\mathcal{E}$ and a positive constant $\mathcal{L}$ such that $|F(u) - F(v)| \le \mathcal{L}|u - v|$ for all $u$ and $v$ in $B_\epsilon(y(t_{n-1}))$. In addition, with $b = \epsilon/2$, the function $F$ is continuous and bounded with bound $M$ in the compact set $B = \{u \in \mathbb{R}^d \text{ such that } |u - y(t_{n-1})| \le b\}$. We claim that the solution of (9) is unique in $B$. It is obvious that the


argument  employed  to  achieve  this  is  exactly  the  same  for  each 
integral  equation  because  for  fixed  m,  each  integral  equation  is 
solved  independently  of  each  other.  The  following  lemma  can  be 
applied  appropriately  to  each  of  the  integral  equations  in  (9). 

Lemma 6.1. Assume that $(\alpha, \beta) \in B$. Then the integral equation

$$\zeta(t) = \alpha + \int_{t_{n-1}}^{t} F_1(\zeta, \beta)\, ds, \qquad (38)$$

admits a unique solution with $(\zeta, \beta) \in B$.

Proof. Part of the proof closely follows standard arguments for existence and uniqueness (see [47]). We set $\zeta^{(0)} = \alpha$ and compute

$$\zeta^{(j)}(t) = \alpha + \int_{t_{n-1}}^{t} F_1(\zeta^{(j-1)}, \beta)\, ds, \qquad (39)$$

for $j = 1, 2, \ldots$. For $j = 1$, this gives $|\zeta^{(1)}(t) - \alpha| \le M(t - t_{n-1}) \le M\Delta t_n$. Then by choosing $\Delta t_n \le b/M$ (and thus $t_n \le t_{n-1} + b/M$) we have $(\zeta^{(1)}, \beta) \in B$. By induction, $(\zeta^{(j)}, \beta) \in B$. We proceed to show that the sequence $\{\zeta^{(j)}\}$ converges to an element of $B$. Using the Lipschitz condition, $|\zeta^{(1)}(t) - \zeta^{(0)}(t)| \le (t - t_{n-1})M$, and induction gives

$$|\zeta^{(j)}(t) - \zeta^{(j-1)}(t)| \le \frac{M}{\mathcal{L}} \frac{(\mathcal{L}(t - t_{n-1}))^{j}}{j!} \le \frac{M}{\mathcal{L}} (\mathcal{L}\Delta t_n)^{j},$$

for $j \ge 2$. As long as $\mathcal{L}\Delta t_n < 1$, we know that for $l > k > N$

$$|\zeta^{(l)}(t) - \zeta^{(k)}(t)| \le \sum_{j=k+1}^{l} |\zeta^{(j)}(t) - \zeta^{(j-1)}(t)| \le \frac{M}{\mathcal{L}} \frac{(\mathcal{L}\Delta t_n)^{N+1}}{1 - \mathcal{L}\Delta t_n}.$$

By choosing $\Delta t_n < \min\{b/M, 1/\mathcal{L}\}$, $|\zeta^{(l)}(t) - \zeta^{(k)}(t)|$ vanishes as $N \to \infty$. This implies that $\zeta^{(j)}(t)$ is a Cauchy sequence of continuous functions in $I_n = [t_{n-1}, t_n]$ which converges uniformly to an element in $C(I_n)$. We pass to the limit in (39), so that this limit satisfies (38). The uniqueness of this limit is established by contradiction using $\mathcal{L}\Delta t_n < 1$. □

Fig. 16. Time history of error for solving (2) using three iterations. Left: $M_1$ (slow). Middle: $m_3$ ("bridge"). Right: $m_7$ (fast). $Q_3$ and $Q_4$ are essentially zero and are not shown. Time steps: $\Delta t = 0.02$, $\Delta s_1 = \Delta t/32$, $\Delta s_2 = \Delta t/8$.
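The successive approximation used in the proof can be illustrated numerically. The sketch below is an assumption-laden toy, not part of the original analysis: it takes a scalar right-hand side $F(\zeta) = -\zeta$ (Lipschitz constant 1), the interval $[0, 0.5]$ so that the product of Lipschitz constant and interval length is below one, and it approximates the integral with the trapezoidal rule on a uniform grid. The function name and parameters are hypothetical.

```python
import math


def picard_iterates(F, alpha, t0, t1, n_grid=200, n_iter=8):
    """Successive approximations zeta_j(t) = alpha + int_{t0}^{t} F(zeta_{j-1}) ds.

    Returns the last iterate on the grid and the sup-norm change at each
    iteration. The integral is approximated by the trapezoidal rule.
    """
    h = (t1 - t0) / n_grid
    current = [alpha] * (n_grid + 1)  # zeta_0 is the constant alpha
    sup_changes = []
    for _ in range(n_iter):
        values = [F(c) for c in current]
        new = [alpha]
        for i in range(1, n_grid + 1):
            new.append(new[-1] + 0.5 * h * (values[i - 1] + values[i]))
        sup_changes.append(max(abs(a - b) for a, b in zip(new, current)))
        current = new
    return current, sup_changes


zeta, changes = picard_iterates(lambda c: -c, 1.0, 0.0, 0.5)
```

For this example the changes between successive iterates shrink roughly like $0.5^j / j!$, and the final iterate agrees with $e^{-t}$ up to the quadrature error.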

Now  we  can  use  this  lemma  to  prove  Theorem  3.1. 

Proof of Theorem 3.1. As in Algorithm 2, $y_1^{(0)} = y_1(t_{n-1})$ and $y_2^{(0)} = y_2(t_{n-1})$. We choose $\Delta t_n < \min\{b/M, 1/\mathcal{L}\}$, where all the constants are as in the paragraph preceding Lemma 6.1. The existence of the sequences is established by repeated application of Lemma 6.1. For $m = 1$, we designate $\alpha = y_1(t_{n-1})$ and $\beta = y_2^{(0)}$. Then by Lemma 6.1, the integral equation governing $y_1^{(1)}$ admits a unique solution with $(y_1^{(1)}, y_2^{(0)}) \in B$. Similarly, with $\alpha = y_2(t_{n-1})$ and $\beta = y_1^{(1)}$, Lemma 6.1 guarantees that $y_2^{(1)}$ is unique with $(y_1^{(1)}, y_2^{(1)}) \in B$. We can repeatedly apply this lemma and use an induction argument to show that the sequences $(y_1^{(m)}, y_2^{(m)}) \in B$. Our next task is to establish the convergence of the sequences. Note that

|yl'’(f)  -yf  (Oj  =  -y2(t„-,)|  <  Ai(t-t„-,)  <  AiAt„. 

Then  by  adding  and  subtracting  Fi^yl'^.y^'')  and  applying  the 
Lipschitz  condition  for  F 

KHt)-yr‘(0|</'  (|F.(yf',yy')-F,(y<;',y^'')| 

<  |y«(s)  -y<;>(s)|ds  +i 

Setting $\tau_n = \mathcal{L}\Delta t_n \exp(\mathcal{L}\Delta t_n)$, we apply Gronwall's inequality to obtain

\yf\t)  -y<;>(t)|  < \MCit  -  t„,,)^  exp(£At„)  < 

Similarly, 

l^"’(^)-^'’(^)hZexpT£Ag^"- 

By  induction, 

lyr(0  -y'r‘'’(Ol  «  C„c  and  |y<2'">(0  -y^'"-'>(t)|  <  C„t- 

where $C_n = M/(\mathcal{L}\exp(\mathcal{L}\Delta t_n))$. As long as $\tau_n < 1$, we know that for $l > k > N$

E  krV) -yr”(t)l 

1 

m=N'  '  ‘ 

In other words, enforcing $\tau_n < 1$ ensures that $y_1^{(m)}(t)$ is a Cauchy sequence of continuous functions that converges uniformly to an element in $C(I_n)$. This is also true for $y_2^{(m)}$. We pass to the limit in (9), so that this limit satisfies (1) in $I_n$. Uniqueness is again established by contradiction. □

6.2. Convergence analysis for the iterative multirate Galerkin finite element method

The errors are denoted by $e_i^{(m)} = y_i^{(m)} - Y_i^{(m)}$ for $i = 1, 2$. It is obvious from the description of the method that the finite element solution is a consistent numerical discretization of the differential equations governing $y^{(m)}$. Thus intuitively we expect that the error resulting from this discretization can be bounded by some power of the time steps $\Delta s_{i,n}$. Standard analysis of time discontinuous finite element methods for solving systems of ordinary differential equations has been performed by many authors, see for example [16,17,35,22]. In general, one initiates a local analysis within a time sub-interval under the assumption that the initial condition in this interval is exact. Then some form of recursive formula is derived which is used to accumulate the contribution from each time sub-interval to yield an error estimate at the final time.

Similar arguments can be employed to bound the errors of $Y_1^{(m)}$ and $Y_2^{(m)}$ separately. However, there are complications that need to be addressed appropriately. Firstly, one or both of the differential equations may be solved by lagging one component an iterate behind the current one, e.g. in Algorithms 1 and 2, we have chosen to lag $y_2^{(m-1)}$. Secondly, the dependence of the function $F$ on both $Y_1^{(m)}$ and $Y_2^{(m)}$ dictates that the accuracy of one component affects the other.

This is reflected in the following two lemmas. Lemma 6.2 compares the numerical solution of component $Y_2^{(m)}$ in time interval $I_n$ with a similar finite element solution with exact initial condition at $t_{n-1}$ (i.e., it is equal to $y_2^{(m)}(t_{n-1})$). As expected, this comparison depends upon both the error in the initial condition at $t_{n-1}$ and on the error in the approximation $Y_1^{(m)}$. Accumulation of local error in each $I_{k,n}$ with $k = 1, \ldots, L_{2,n}$ gives the quantification of accuracy of component $Y_2^{(m)}$. Furthermore, with slight modification of the proof to take account of the fact that the approximate solution $Y_2$ is known at the previous, rather than the current iteration (at $m - 1$ rather than $m$), a similar estimate is also true for $Y_1^{(m)}$ which we state in Lemma 6.3. In what follows, we set $\|u\|_{I_n} = \sup_{t \in I_n} |u(t)|$, and similarly for

Lemma  6.2.  Let  Z  6  Vl'*^*(/n)  satisfy 

(2-F2(y'rU),w)df+([4.,,„,wj.,)  =  o  (40) 

for  all  W  €  P'’^>('k,").  fe  =  1.2 . L2,„.  and  z(to„)  =yi""(t„-,).  With 

d>  =  z-  yf 

<  10exp(24£Af„)(|e<2'"'(t;.,)f  +5^|er>Q, 
for  sufficiently  small  As2.„,  where  |e', =  maxuntj^lle','"’],^^}. 

Proof.  We  know  that  for  0  e  V^'’‘\lk.r,)  with  q2  =  0,  1, 

=  max  ^  \4>Uf  +  (41) 

Subtracting  (6)  from  (40)  gives 

I  (0+F2(y'r>.yf>) - F2[yT\z),w)dt+  =  o 

(42) 

for  every  W  e  P*’'' (/.,,„)•  With  W=  0  in  (42)  and  using  the  Lipschitz 
condition  of  F,  we  find  that 


This  in  turn  gives 

(43) 

^  (3101^  +  lerV)  dr. 

(44) 

Using  W  =  (t  -  tt_i,„)0  in  (42)  and  estimating  yield 


iAs|„|0N£^^^  (|0p  +  |e<'">p)(t-tk.,,„)t/t 
from which we get

IasIMI,  ^  +  |e(;>Q  dt. 

Because  \(t>\^dt  <  2As2,„|<^t„P  combining  this  with 

the  last  inequality  yields 

^1-|(2£AS2,„)")^  \<t>\^dt^2AS2M;/  +^(2CAS2,„f  (ef>|'dt. 

Provided  that  -|(2£AS2,„)^^  >  2/3  (which  is  equivalent  to 
having  (2£As2,n)^  <  1/2),  we  get 

f  \d>\'dt<3AS2„\d>:f  +  {2CAS2j,)^f  le'^fdf 
v/m  Jk,  '  ' 

<3AS2,n\<t>;/  +  ll^  (45) 

Substitution  of  (45)  to  (44)  gives 

+  ^CAS2.n\<l>:,n\‘  +%j  krf  df- 

Provided  that  1  -9£As2,n  >  1/10  (which  is  equivalent  to  having 
CAS2,n  <  1/10),  then  this  last  inequality  yields 

^  eXp(24£AS2,„)  df  j  . 

We  can  now  undo  the  recursive  relation  to  get 

<  exp(24/:fcAS2,„) 


for  k  =  1,2 . f.2,n'  Furthermore,  using  (41)  in  (43)  and  estimating 

we  get 

<  4|0(f,-.,,„)p  +  2cJ^^  +  |erf )  dt, 

which  gives 

(1  -  62:AS2,„)|.^|f^^  <  +  2c[  le'^f  df. 

and  thus 

l<<10|<^k.,.„l'  +  52:|^  |er|'dr, 

Using  (46)  with  k  =  k  -  1  in  the  last  inequality,  we  get 
Wl  ^  10exp(24£(fc -  1)A52,„) +T S  i 
+  5z:j|'^  |e‘,'"'pdt 

<  1 0exp(24z:(fc -  1  )AS2,„) ^|ef > leff  drj 

<  10exp(24z:At„)(|e<'">(t-.,)l'  +^|f'rQ  O 

Lemma  6.3.  Let  X  e  V*''''(/„)  satisfy 

I  (X  -  f ,  (X.yr-") .  df  +  K,)  =  0  (47) 


for  all  V'€P<’'>(/,,„),  /=  1,2... and  x(t„-„)  =yr(t-.-i)-  With 
c  =  X-V''">. 

\cl  <  i0exp(24£Af„)(|e,(f„-.,)l^  +^|e2lf„  +^|rr>Q 

for  sufficiently  small  Asi  „,  where  r^'"^  = 


Proof.  This  is  obtained  using  the  same  argument  as  in 
Lemma  6.2.  □ 

The two lemmas above are true for any iteration $m$ within a time interval $I_n$. Not only is it apparent that the accuracy of one component affects the other, but as stated in Lemma 6.3, the error also depends upon the accuracy of the previous iteration. Thus, in addition to the numerical discretization, the iteration residuals would also influence the accuracy of the overall solution. The following lemma states this fact about the error $e^{(m)} = [e_1^{(m)}\ e_2^{(m)}]^\top$.

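Lemma 6.4. For sufficiently small $\Delta t_n$, the error of the finite element solutions at iteration level $m$ over time interval $I_n$ satisfies

$$|e^{(m)}(t_n^-)|^2 \le \exp(C\Delta t_n)\Big(|e^{(m)}(t_{n-1}^-)|^2 + \Delta t_n^2\big(C_1 \Delta s_{1,n}^{2q_1+2} + C_2 \Delta s_{2,n}^{2q_2+2}\big) + C_3 \Delta t_n \|r_2^{(m)}\|_{I_n}^2\Big).$$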

Proof.  Given  the  finite  element  solutions  V'*,'"’  and  over  time 
interval  we  use  X  and  Z  in  Lemmas  6.2  and  6.3,  to  write 

er'  =  (yr-x)-Lc  and  ef’ =  (yf’ "  z)  +  ti, 

where $\zeta$ and $\phi$ are as in those two lemmas. Moreover, it has been established ([17,35,22]) that

ly',"”  -  X[^  <  CAf„A5’;;'  and  |yf  >  -  x|^^  <  CAf„As’;;' , 

Using  Lemmas  6.2  and  6.3,  we  get 


^  CArSA5?;r’>  +  C  At^As^'’'^” 

+  10exp(24GAt„)(|e<;>{f„-.,)|'+5 
+  -I-  10exp(24£Af„) 

X  (h2"”{f.--,)r+^|er’Q- 


2  5£Af„ 

'*'~2~^  I. 


Arguing as in Lemma 6.2, for sufficiently small $\Delta t_n$, there are constants $C$, $C_1$, $C_2$, and $C_3$ such that

<  exp(CAf„)(|e''")(f„-.,)|' 

+  +  CzAs^S^"”)  +  C3Af„|r2|?„). 

Because  |e''">{t;;)p  $  inequality  gives  the 

desired  estimate.  □ 

Finally,  we  have  the  proof  of  Theorem  3.2. 

Proof. Lemma 6.4 in Section 6 is a recursive relation for the error within time interval $I_n$. With $m = M_n$, this recursive relation is unwound to obtain the error estimate at the final time $t_N = T$. □
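For readers who want the unwinding step written out, the following is a sketch in simplified notation that is not in the original: $E_n$ stands for the squared error at $t_n^-$, $\delta_n$ collects the step-size and residual terms added inside the parentheses of Lemma 6.4 on interval $I_n$, the same constant $C$ is assumed on every interval, and the iterate index is suppressed.

```latex
% Recursion from Lemma 6.4 in the simplified notation described above.
E_n \le e^{C \Delta t_n}\,\bigl(E_{n-1} + \delta_n\bigr), \qquad n = 1, \dots, N,
% Claim obtained by unwinding the recursion:
\quad\Longrightarrow\quad
E_N \le e^{C (t_N - t_0)}\,E_0 + \sum_{n=1}^{N} e^{C (t_N - t_{n-1})}\,\delta_n .
% Induction step: assume the claim holds with N replaced by N-1. Then
E_N \le e^{C \Delta t_N}\Bigl( e^{C (t_{N-1} - t_0)} E_0
      + \sum_{n=1}^{N-1} e^{C (t_{N-1} - t_{n-1})}\,\delta_n + \delta_N \Bigr)
    = e^{C (t_N - t_0)} E_0 + \sum_{n=1}^{N} e^{C (t_N - t_{n-1})}\,\delta_n ,
% using t_N = t_{N-1} + \Delta t_N.
```

Each local contribution is therefore amplified by at most $e^{CT}$ by the final time.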
7. Summary

In this paper, we carry out an a priori analysis and derive a hybrid a posteriori - a priori error estimate for a multirate numerical method for an ordinary differential equation that presents significantly different scales within the components of the model. We formulate an iterative multirate Galerkin finite element method, then employ adjoint operators and variational analysis. The a priori analysis uses the fact that the iterative multirate Galerkin finite element method is a consistent approximation of the analytic fixed point iteration we construct. The hybrid estimate has the form of a computable leading order expression plus uncomputable quantities that are provably higher order in an asymptotic sense. These higher order terms vanish when convergence in both the solution and adjoint is reached. The computable expression represents the error in terms of contributions from the numerical error arising in the solution of each component, the iteration error, and the error in the adjoint arising from the analytic fixed point iteration. The a posteriori analysis takes into account the fact that the original problem and an analytic fixed point iteration are associated with different adjoint problems. We conclude with some examples that demonstrate the accuracy of the computable parts of the hybrid a posteriori - a priori estimate.

SUMMARY

In this paper, we develop an a posteriori error analysis for operator decomposition iteration methods applied to systems of coupled semilinear elliptic problems. The goal is to compute accurate error estimates that account for the combined effects arising from numerical approximation (discretization) and operator decomposition iteration. In an earlier paper [1], we considered "triangular" systems that can be solved without iteration. In contrast, operator decomposition iterative methods for fully coupled systems involve an iterative solution technique. We construct an error estimate for the numerical approximation error that specifically addresses the propagation of error between iterates and provide a computable estimate for the iteration error arising due to the decomposition of the operator. Finally, we extend the adaptive discretization strategy in [1] to systematically reduce the discretization error. Copyright © 2010 John Wiley & Sons, Ltd.

1. INTRODUCTION

We develop an a posteriori error analysis framework for operator decomposition iteration methods applied to systems of coupled semilinear elliptic problems of the form

$$
\begin{aligned}
\mathcal{L}_1(x, u_1, Du_1, D^2u_1) &= f_1(x, u_1, u_2, Du_2, u_3, Du_3, \ldots, u_n, Du_n), & x \in \Omega,\\
\mathcal{L}_2(x, u_2, Du_2, D^2u_2) &= f_2(x, u_1, Du_1, u_2, u_3, Du_3, \ldots, u_n, Du_n), & x \in \Omega,\\
&\;\;\vdots\\
\mathcal{L}_n(x, u_n, Du_n, D^2u_n) &= f_n(x, u_1, Du_1, u_2, Du_2, \ldots, u_{n-1}, Du_{n-1}, u_n), & x \in \Omega,
\end{aligned} \qquad (1.1)
$$

where $Dv$ and $D^2v$ are the first and second order derivative operators, $\{\mathcal{L}_i, i = 1, \ldots, n\}$ is a collection of linear, uniformly coercive, elliptic differential operators, $\{f_i, i = 1, \ldots, n\}$ is a collection of differentiable functions, $\Omega$ is a convex polygonal domain with boundary $\partial\Omega$, and (1.1) is supplied with suitable boundary conditions on $\partial\Omega$. We note that the coupling in the system occurs through the right-hand sides only. We assume that (1.1) satisfies suitable conditions to guarantee a solution in weak form, e.g. generic conditions involve a uniform bound on the derivatives of $f$. An extension of our analysis to fully nonlinear elliptic systems is straightforward but tedious in detail, e.g. involving a messy linearization of the diffusion operator.

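Interest in coupled physics problems and their solution arises in many fields. The Oregonator model for the Belousov-Zhabotinsky reaction system,

$$
\begin{aligned}
\epsilon \frac{\partial u_1}{\partial t} &= \epsilon D_1 \nabla^2 u_1 + u_1 - u_1^2 - f u_2 \frac{u_1 - q}{u_1 + q},\\
\frac{\partial u_2}{\partial t} &= D_2 \nabla^2 u_2 + u_1 - u_2,
\end{aligned}
$$

is an example of an important time-dependent coupled semilinear system. In order to consider waves traveling with permanent form with velocity $c$ in the $x$-direction, we make the ansatz $u_i(t, x, y) = u_i(\eta, y)$, $i = 1, 2$, where $\eta = x - ct$. Upon substitution we obtain the stationary coupled semilinear elliptic system,

$$
\begin{aligned}
-\epsilon D_1 \nabla_\eta^2 u_1 &= \epsilon c \frac{\partial u_1}{\partial \eta} + u_1 - u_1^2 - f u_2 \frac{u_1 - q}{u_1 + q},\\
-D_2 \nabla_\eta^2 u_2 &= c \frac{\partial u_2}{\partial \eta} + u_1 - u_2,
\end{aligned}
$$

where $\nabla_\eta^2(\cdot) = \partial^2(\cdot)/\partial \eta^2 + \partial^2(\cdot)/\partial y^2$.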

In  many  practical  situations,  coupled  systems  of  partial  differential  equations  are  decomposed  into 
individual  physics  components,  each  of  which  is  solved  with  a  code  specialized  to  the  particular 
type  of  physics,  while  the  solution  is  obtained  by  various  forms  of  iteration  and/or  operator 
decomposition.  Such  approaches  introduce  new  forms  of  instability  and  new  sources  of  error  that 
must  be  included  in  an  a  posteriori  error  analysis,  see  [2]  for  an  overview. 

In this paper, we assume that the system (1.1) is solved by an operator decomposition approach that involves iteratively solving for $u_i$, $i = 1, \dots, n$, the ordered sequence of problems

\[
\mathcal{L}_i(x, u_i, Du_i, D^2 u_i) = f_i(x, u_1, Du_1, u_2, Du_2, \dots, u_{i-1}, Du_{i-1}, u_i, u_{i+1}, Du_{i+1}, \dots, u_n, Du_n),
\]

which are obtained by substituting solutions $u_j$ for $j \neq i$ computed in a previous step in the equations in (1.1) and then solving for $u_i$. Theoretically, the sequence is iterated until convergence (if it does converge), while in practice, a finite number of iterations is used.

In [1], we present an a posteriori error analysis for operator decomposition methods applied to systems of elliptic equations that have an “upper triangular” form, so that one iteration through the system produces the solution. That analysis accounts for:

•  errors  arising  from  the  discretization  of  each  component  elliptic  problem, 

•  the  transfer  of  error  between  the  component  elliptic  problems, 

•  errors  resulting  from  using  different  discretizations  for  the  component  elliptic  problems. 

In  a  fully  coupled  system  (1.1),  we  need  to  iterate  through  the  components  until,  hopefully,  the 
iteration  converges.  In  addition  to  the  errors  affecting  the  solution  of  a  triangular  system,  the  a 
posteriori  error  analysis  now  must  also  account  for: 

•  the  effects  of  finite  iteration, 

•  the  transfer  of  errors  between  iterations. 


These two effects are the focus of the analysis in this paper. The results in this paper can be combined with the full analysis of [1] to treat all five of these effects in one estimate.

We let $U^{\{k\}}$ be a numerical approximation obtained by iterating numerical discretizations of the differential equations $k$ times. To carry out the a posteriori error analysis, we decompose the error as

\[
u - U^{\{k\}} = \underbrace{\big(u - u^{\{k\}}\big)}_{\text{analytic iteration error}} + \underbrace{\big(u^{\{k\}} - U^{\{k\}}\big)}_{\text{numerical error}} = \varepsilon^{\{k\}} + e^{\{k\}},
\tag{1.2}
\]

where $u^{\{k\}}$ is the analytic solution obtained at iteration $k$ by solving the sequence of iterated differential equations exactly. We estimate these two components separately. This decomposition is motivated by the observation that the iterative discrete approximation is a consistent numerical
solution  of  the  analytic  iterative  problem.  One  consequence  is  that  this  simplifies  the  definition 
of  an  appropriate  adjoint  for  the  error  analysis.  Namely,  we  use  the  adjoint  associated  with  the 
solution  operator  of  the  sequence  of  iterated  component  problems.  This  is  a  complex  operator  that 
can  itself  be  defined  through  an  iterated  sequence  of  adjoint  problems  to  the  individual  components. 
In  practical  terms,  we  can  compute  the  resulting  a  posteriori  error  estimate  without  forming  and 
solving  the  adjoint  for  the  fully  coupled  system.  Solving  the  full  adjoint  of  the  coupled  system  is 
computationally  impractical  in  situations  in  which  operator  decomposition  iteration  is  used  to  solve 
the  forward  problem. 

The main focus of this paper is a posteriori error estimation of the numerical error $e^{\{k\}}$. At the $k$th step of an iterative solution process and for a given quantity of interest, the analysis accounts for:


•  the  numerical  errors  made  at  the  current  iteration, 

•  the  numerical  errors  made  at  all  previous  iterations, 

•  the  error  due  to  the  iterative  approximation. 

For clarity of exposition, we do not include treatment of the errors arising from the use of different discretizations for different components. Such effects are already treated in the earlier paper [1] and those results can be combined with the results in this paper in a straightforward way.

To obtain a full a posteriori error estimate of the error $u - U^{\{k\}}$, we also have to estimate the analytic iteration error $\varepsilon^{\{k\}}$. The difficulty is that this involves the true solution and an “iterative” true solution, both of which are unknown. However, for a fixed space mesh, we can adapt the classical asymptotic estimator for the error in an iterative approximation to this situation. Lacking such an estimate, the a posteriori error estimate of $e^{\{k\}}$ provides an estimate of the full error provided the numerical error dominates the iteration error, i.e. in the limit of increasing iterations.

When we have estimates for $\varepsilon^{\{k\}}$ and $e^{\{k\}}$, we can then derive a generalized adaptive algorithm that adjusts both the number of iterations and the level of discretization in each component to achieve a desired accuracy with relative computational efficiency.

This work can be differentiated from a number of previous analyses of nonlinear problems, e.g. see references in [3-9], in concentrating on new issues arising in fully coupled systems and dealing explicitly with the effects of operator decomposition and finite iteration. Ignoring the coupling involved in treating a system, e.g. simply estimating the error in each component in isolation, can lead to catastrophic failure of the error estimates. See Example 3.2 of [1] and Example 4.1 below.
The  alternative  approach,  which  uses  the  adjoint  of  the  fully  coupled  semilinear  elliptic  system, 
provides  a  valid  error  estimate  (up  to  linearization  error),  but  fails  to  differentiate  the  sources  of 
error. 

The outline of the paper is as follows. To explain the ideas behind the definition of an appropriate adjoint operator for the iterated system and the a posteriori error analysis, we begin by deriving an a posteriori error estimate for the iterated solution of finite dimensional algebraic systems in Section 2. This derivation contains the main ideas without the complications of differential equation
discretization  errors  and  moreover  makes  it  easy  to  construct  several  illuminating  examples.  We 
then  turn  to  the  analysis  of  coupled  semilinear  elliptic  problems  in  Section  3.  We  present  results 
for  three  particular  classes  of  operator  decomposition  iteration  techniques,  namely  block  Jacobi, 
block  Gauss-Seidel  and  relaxed  block  Jacobi.  In  Section  4,  we  give  numerical  examples  of  different 
aspects  of  the  error  estimation  framework,  concluding  with  an  adaptive  algorithm  that  adaptively 
refines  both  the  computational  mesh  and  the  operator  decomposition  iteration  to  converge  to  an 
accurate  solution. 
2. PRELIMINARY EXAMPLE: ITERATIVE SOLUTION OF ALGEBRAIC SYSTEMS

In  order  to  explain  the  construction  of  an  appropriate  adjoint  operator  for  an  iterated  solution 
approach  and  the  idea  that  the  numerical  error  of  the  current  iterate  is  affected  by  errors  introduced 
at  all  previous  iterations,  we  first  present  the  analysis  in  the  context  of  finite  dimensional  algebraic 
systems.  We  begin  with  a  linear  problem  and  then  treat  a  nonlinear  problem. 

2.1. Estimating the numerical error for linear algebraic systems

We consider the solution of

\[ Aw = b. \]

We construct an iterative solution method using a matrix decomposition of $A$ of the form

\[ A = D + C, \]

where we assume that $D$ is invertible. The solution procedure uses the observation that

\[ (D + C)w = b \;\Rightarrow\; Dw = b - Cw \]

and solves

\[
w^{\{0\}} = 0, \qquad D w^{\{i\}} = b - C w^{\{i-1\}}, \quad i = 1, 2, \dots, k.
\tag{2.1}
\]

We assume that the iterative scheme converges, which depends on the spectral radius of $D^{-1}C$.
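The splitting iteration (2.1) can be sketched in a few lines. The following is a minimal illustration only, assuming NumPy, a small hypothetical test matrix built in the spirit of Section 2.1.4, and a direct solve with $D$; the names `splitting_iteration` and `spectral_radius` are illustrative and do not come from the paper.

```python
import numpy as np

def splitting_iteration(D, C, b, k):
    """Return the iterates w^{0}, ..., w^{k} of D w^{i} = b - C w^{i-1}, w^{0} = 0."""
    w = np.zeros(len(b))
    iterates = [w]
    for _ in range(k):
        w = np.linalg.solve(D, b - C @ w)
        iterates.append(w)
    return iterates

def spectral_radius(M):
    """Largest eigenvalue modulus of M."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))

# Hypothetical test problem: A = 20 I + R with a symmetric R of Frobenius norm one.
rng = np.random.default_rng(0)
n = 5
R = rng.standard_normal((n, n))
R = 0.5 * (R + R.T)
R /= np.linalg.norm(R)
A = 20.0 * np.eye(n) + R
D = np.diag(np.diag(A))   # diagonal part
C = A - D                 # remainder
b = np.ones(n)

rho = spectral_radius(np.linalg.solve(D, C))   # convergence requires rho to be below one
iterates = splitting_iteration(D, C, b, 12)
final_error = float(np.linalg.norm(np.linalg.solve(A, b) - iterates[-1]))
```

For this strongly diagonally dominant choice the spectral radius is far below one, so a dozen iterations already reproduce the direct solution to roughly machine precision.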

At each stage of the iterative process, we compute a numerical solution $W^{\{i\}} \approx w^{\{i\}}$. Our goal is to estimate the error in a quantity of interest that is representable as a linear functional of the solution, i.e., a quantity of interest of the form $(w, \psi)$, at any iterate $i$. We write this error as

\[
\big(w - W^{\{i\}}, \psi\big) = \underbrace{\big(w - w^{\{i\}}, \psi\big)}_{\text{analytic iteration error}} + \underbrace{\big(w^{\{i\}} - W^{\{i\}}, \psi\big)}_{\text{numerical error}}.
\]

Here, a superscript in braces $\{i\}$ indicates variables corresponding to forward iteration $i$. Let $k$ be the total number of forward iterations performed. Later, we use a superscript in square brackets $[j]$ to denote the adjoint problem corresponding to forward iteration $i = k - j + 1$.

Finally, we introduce the notation

\[ \mathcal{R}_i\big(U^{\{j\}}, \phi^{[k]}; U^{\{j-1\}}\big) \]

to denote the residual of equation $i$, at iteration $j$, weighted by the $k$th adjoint problem, evaluated using the solution generated at iteration $j - 1$.

2.1.1. Estimation of the numerical error. Because the previous iterate enters as data, the numerical error $e^{\{i\}} = w^{\{i\}} - W^{\{i\}}$ depends not only upon the numerical error made at the $i$th iteration, but also on the numerical errors made at previous iterations. Hence, we need to estimate the effects of these “inherited errors”.

Given $W^{\{0\}} = 0$, we compute the sequence of approximate solutions of (2.1), $i = 1, \dots, k$, as

\[
D W^{\{i\}} = b - C W^{\{i-1\}},
\tag{2.2}
\]

where each solve is carried out only approximately.

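Theorem 2.1

The error in the quantity of interest can be estimated as,

\[
\big(e^{\{k\}}, \psi\big) = \sum_{j=1}^{k} \mathcal{R}\big(W^{\{k-j+1\}}, \phi^{[j]}; W^{\{k-j\}}\big),
\tag{2.3}
\]

where the adjoint problems are defined,

\[
D^{\top} \phi^{[1]} = \psi,
\tag{2.4}
\]

\[
D^{\top} \phi^{[j]} = -C^{\top} \phi^{[j-1]}, \quad j = 2, \dots, k,
\tag{2.5}
\]

and

\[
\mathcal{R}\big(W^{\{k-j+1\}}, \phi^{[j]}; W^{\{k-j\}}\big) = \big(b - C W^{\{k-j\}} - D W^{\{k-j+1\}}, \phi^{[j]}\big).
\tag{2.6}
\]

Note that the adjoint problems (2.4)-(2.5) involve the simpler matrix associated with the iterative scheme, not the full adjoint of the original matrix $A$. To emphasize the role of the error in the most recent ($k$th) iteration, we may write

\[
\big(e^{\{k\}}, \psi\big) = \mathcal{R}\big(W^{\{k\}}, \phi^{[1]}; W^{\{k-1\}}\big) + \sum_{j=2}^{k} \mathcal{R}\big(W^{\{k-j+1\}}, \phi^{[j]}; W^{\{k-j\}}\big).
\tag{2.7}
\]

Also note that the sequence of coupled adjoint problems (2.5) yields the natural adjoint to the solution operator associated with the iterative method. We emphasize this point by numbering the adjoint problems in the reverse order to the forward iteration number.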

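Proof

The estimate of the numerical error after $k$ iterations is

\[
\big(e^{\{k\}}, \psi\big) = \big(b - C W^{\{k-1\}} - D W^{\{k\}}, \phi^{[1]}\big) - \big(C(w^{\{k-1\}} - W^{\{k-1\}}), \phi^{[1]}\big).
\tag{2.8}
\]

The first term on the right of (2.8) is the usual error estimate depending on the computable residual of $W^{\{k\}}$ and the corresponding adjoint solution. The second term on the right of (2.8) estimates the contribution to the error of $W^{\{k\}}$ resulting from the error inherited from the approximation $W^{\{k-1\}}$ to $w^{\{k-1\}}$. This inherited error term can be expressed as the error in a new quantity of interest at the previous iteration by noting that

\[ -\big(C e^{\{k-1\}}, \phi^{[1]}\big) = \big(e^{\{k-1\}}, -C^{\top} \phi^{[1]}\big). \]

To estimate the error in this new quantity of interest, we solve the new adjoint problem

\[ D^{\top} \phi^{[2]} = -C^{\top} \phi^{[1]} \]

to obtain

\[
\big(e^{\{k-1\}}, -C^{\top} \phi^{[1]}\big) = \big(e^{\{k-1\}}, D^{\top} \phi^{[2]}\big) = \big(D e^{\{k-1\}}, \phi^{[2]}\big)
= \big(b - C W^{\{k-2\}} - D W^{\{k-1\}}, \phi^{[2]}\big) - \big(C e^{\{k-2\}}, \phi^{[2]}\big).
\]

Continuing in this fashion, we see that the desired estimate of the error in the quantity of interest $(e^{\{k\}}, \psi)$ after $k$ iterations requires the solution of (2.4) and the recursive solution of an ordered sequence of $(k - 1)$ adjoint problems

\[ D^{\top} \phi^{[j]} = -C^{\top} \phi^{[j-1]}, \quad j = 2, \dots, k, \]

and

\[ \big(e^{\{k\}}, \psi\big) = \sum_{j=1}^{k} \big(b - C W^{\{k-j\}} - D W^{\{k-j+1\}}, \phi^{[j]}\big). \]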
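The representation formula of Theorem 2.1 can be checked numerically on a small example. The sketch below is an illustration under stated assumptions rather than the computation reported in the paper: it assumes NumPy, a hypothetical 6 x 6 test matrix, and it mimics the inexact solves by rounding each iterate to four decimal places. The function names `forward`, `adjoints` and `residual_terms` are introduced here for illustration only.

```python
import numpy as np

def forward(D, C, b, k, decimals=None):
    """Iterates of D w^{i} = b - C w^{i-1} with w^{0} = 0.
    If decimals is given, each solve is rounded (a stand-in for numerical error)."""
    w = np.zeros(len(b))
    iterates = [w]
    for _ in range(k):
        w = np.linalg.solve(D, b - C @ w)
        if decimals is not None:
            w = np.round(w, decimals)
        iterates.append(w)
    return iterates

def adjoints(D, C, psi, k):
    """phi[j-1] holds phi^{[j]}: D^T phi^{[1]} = psi, D^T phi^{[j]} = -C^T phi^{[j-1]}."""
    phi = [np.linalg.solve(D.T, psi)]
    for _ in range(k - 1):
        phi.append(np.linalg.solve(D.T, -C.T @ phi[-1]))
    return phi

def residual_terms(D, C, b, W, phi):
    """Terms (b - C W^{k-j} - D W^{k-j+1}, phi^{[j]}) for j = 1, ..., k."""
    k = len(W) - 1
    return [float((b - C @ W[k - j] - D @ W[k - j + 1]) @ phi[j - 1])
            for j in range(1, k + 1)]

# Hypothetical test problem.
rng = np.random.default_rng(0)
n, k = 6, 8
R = rng.standard_normal((n, n))
R = 0.5 * (R + R.T)
R /= np.linalg.norm(R)
A = 20.0 * np.eye(n) + R
D = np.diag(np.diag(A))
C = A - D
b = rng.standard_normal(n)
psi = np.zeros(n)
psi[-1] = 1.0                            # quantity of interest: last component

w = forward(D, C, b, k)                  # analytic iterates w^{i}
W = forward(D, C, b, k, decimals=4)      # numerical iterates W^{i}
phi = adjoints(D, C, psi, k)
terms = residual_terms(D, C, b, W, phi)

true_error = float((w[k] - W[k]) @ psi)  # (e^{k}, psi)
estimate = sum(terms)                    # right-hand side of the representation
```

In the linear case the representation is an identity, so `estimate` and `true_error` should agree up to floating point round-off; `terms[0]` is the contribution of the most recent iteration and the remaining entries are the inherited contributions.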


Copyright  ©  2010  John  Wiley  &  Sons,  Ltd. 
Prepared  using  nmeauth.cis 


Int. J. Numer. Meth. Engng (2010)
DOI:  10.1002/nme 




2.1.2. The decay of influence of contributions from early iterations. To estimate the numerical error in the quantity of interest after $k$ iterations, we nominally need to solve $k$ adjoint problems (one of the form (2.4) and $k - 1$ of the form (2.5)). Each of these is solvable, but the number can be significant. The sequence of adjoint problems has the form

\[ \phi^{[j+1]} = \big(-D^{-\top} C^{\top}\big)^{j} \phi^{[1]}, \quad j = 1, \dots, k-1. \]

By noting that

\[ D^{-1}\big(C D^{-1}\big) D = D^{-1} C, \]

we see that $D^{-1}C$, $CD^{-1}$ and $D^{-\top}C^{\top}$ all have the same spectral radius. For the forward iteration to converge the matrix product $D^{-1}C$ must have a spectral radius smaller than one. This means that we expect that

\[ \big\|\phi^{[i+m]}\big\| = \big\|\big(-D^{-\top} C^{\top}\big)^{m} \phi^{[i]}\big\| \to 0 \]

for fixed $i$ as $m \to \infty$. This suggests that we might obtain a reasonable approximation

\[
\big(e^{\{k\}}, \psi\big) \approx \sum_{j=1}^{l} \mathcal{R}\big(W^{\{k-j+1\}}, \phi^{[j]}; W^{\{k-j\}}\big)
\tag{2.9}
\]

for some small $l \geq 1$. The more rapid the convergence of the forward iteration (the smaller the spectral radius of $D^{-1}C$), the smaller the value of $l$ required.
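The decay of the adjoint solutions that motivates the truncation (2.9) can be observed directly. The following minimal sketch assumes NumPy and a hypothetical diagonally dominant test matrix; it only tracks the norms of the successive adjoint solutions and is not taken from the paper.

```python
import numpy as np

# Hypothetical test problem: A = 20 I + R with Frobenius norm of R equal to one.
rng = np.random.default_rng(1)
n = 6
R = rng.standard_normal((n, n))
R /= np.linalg.norm(R)
A = 20.0 * np.eye(n) + R
D = np.diag(np.diag(A))
C = A - D

psi = np.zeros(n)
psi[-1] = 1.0

# Adjoint recursion phi^{[j+1]} = -D^{-T} C^T phi^{[j]}, starting from D^T phi^{[1]} = psi.
M = -np.linalg.solve(D.T, C.T)
phi = np.linalg.solve(D.T, psi)
norms = []
for _ in range(8):
    norms.append(float(np.linalg.norm(phi)))
    phi = M @ phi

rho = float(np.max(np.abs(np.linalg.eigvals(M))))  # equals the spectral radius of D^{-1} C
```

Here each adjoint solution is smaller than its predecessor by roughly the factor `rho`, so only the first few terms of the sum contribute noticeably to the estimate.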


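2.1.3. Estimating the iterative error. We begin with a classic estimate for the iteration error based on extrapolation. We define $J(\cdot) = (\cdot, \psi)$ and denote $\varepsilon^{\{k\}} = w - w^{\{k\}}$. Assuming a linearly convergent sequence $J(w^{\{k\}})$, the classic asymptotic argument yields

\[
J\big(\varepsilon^{\{k\}}\big) \approx -\frac{\big(J(w^{\{k\}}) - J(w^{\{k-1\}})\big)^2}{J(w^{\{k\}}) - 2 J(w^{\{k-1\}}) + J(w^{\{k-2\}})}.
\]

This estimate is not directly computable; however, the right-hand side can be further estimated as

\[
J\big(\varepsilon^{\{k\}}\big) \approx -\frac{\big(J(e^{\{k\}}) - J(e^{\{k-1\}}) + J(W^{\{k\}}) - J(W^{\{k-1\}})\big)^2}{J(e^{\{k\}}) - 2 J(e^{\{k-1\}}) + J(e^{\{k-2\}}) + J(W^{\{k\}}) - 2 J(W^{\{k-1\}}) + J(W^{\{k-2\}})}.
\tag{2.10}
\]

The expression (2.10) can be estimated using the a posteriori error estimate, but this is expensive. We may derive another estimate by assuming the numerical error is higher-order than the iteration error, to obtain

\[
J\big(\varepsilon^{\{k\}}\big) \approx -\frac{\big(J(W^{\{k\}}) - J(W^{\{k-1\}})\big)^2}{J(W^{\{k\}}) - 2 J(W^{\{k-1\}}) + J(W^{\{k-2\}})}.
\tag{2.11}
\]

The approximation (2.11) may, however, lead to inaccurate estimates when the iteration and numerical errors are of comparable size.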
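The extrapolation estimate above needs only three consecutive values of the quantity of interest. As a minimal sketch (the function name and the model sequence are assumptions made for illustration), the formula can be tested on an exactly geometric sequence, for which it should reproduce the remaining error up to round-off:

```python
def iteration_error_estimate(J_k, J_km1, J_km2):
    """Extrapolation estimate of J(w) - J(w^{k}) from three consecutive values
    J(w^{k}), J(w^{k-1}), J(w^{k-2}) of a linearly convergent sequence."""
    numerator = (J_k - J_km1) ** 2
    denominator = J_k - 2.0 * J_km1 + J_km2
    return -numerator / denominator

# Model sequence a_k = a + c * rho**k, converging linearly to a.
a, c, rho = 2.0, 0.7, 0.4
sequence = [a + c * rho ** k for k in range(6)]

estimated = iteration_error_estimate(sequence[5], sequence[4], sequence[3])
actual = a - sequence[5]
```

When the values fed into the formula are numerical iterates rather than analytic ones, as in (2.11), the estimate inherits their numerical error, which is the caveat stated in the text.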


2.1.4. Numerical example. We construct a diagonally dominant, symmetric $20 \times 20$ matrix $A = 20I + R$ where $R$ is a random $20 \times 20$ matrix with Frobenius norm less than or equal to one. We decompose $A$ into two matrices $D$ and $C$, $A = D + C$, where $D$ contains only the diagonal entries of $A$. We then solve $Aw = b$ via operator decomposition iteration with the iteration given
by  (2.1)  where  the  solution  at  each  iteration  is  obtained  using  conjugate  gradient  solver  with  a 
tolerance  of  10““*.  The  iteration  terminates  when  <  10“®  which  is  accomplished 

in 11 iterations. We then solve (to within round-off) the sequence of 11 adjoint problems defined recursively by (2.4) and (2.5) for the quantity of interest $w_{20}$, i.e., with $\psi = e_{20}$ (where $e_{20}$ is the unit vector with a single non-zero entry in the 20th row) and compute each of the terms in the error representation formula (2.3) for $k = 1, \dots, 11$. We call the first term in this sum the primary numerical error.

Total Error        Iteration Error    Numerical Error    Primary Numerical Error
7.43823 × 10"®     2.2234 × 10-“      7.4382 × 10-^      9.465 × 10^{-5}

Table 2.1. Error components for example 2.1.4.

The expected decay of the contributions to the error $e_{20}^{\{k\}}$ that are inherited from previous iterations is illustrated in Figure 2.1.

Figure 2.1. Individual “history” terms $\mathcal{R}\big(W^{\{k-j+1\}}, \phi^{[j]}; W^{\{k-j\}}\big)$ for example 2.1.4 illustrating the expected decay in contributions to $e_{20}^{\{k\}}$ with index $j$.

2.2.  Estimating  the  numerical  error  for  nonlinear  algebraic  systems 

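Nonlinearity introduces additional complexities for defining an adjoint problem. We solve the nonlinear equation

\[
Aw = f(w)
\tag{2.12}
\]

by successive approximation

\[
A w^{\{i\}} = f\big(w^{\{i-1\}}\big), \quad i = 1, \dots, k,
\tag{2.13}
\]

with $w^{\{0\}} = 0$. Given $W^{\{0\}} = 0$, we compute the sequence of numerical approximations for $i = 1, \dots, k$ as

\[
A W^{\{i\}} = f\big(W^{\{i-1\}}\big).
\tag{2.14}
\]

Theorem 2.2

The error in a quantity of interest is estimated as,

\[
\big(e^{\{k\}}, \psi\big) = \sum_{j=1}^{k} \mathcal{R}\big(W^{\{k-j+1\}}, \phi^{[j]}; W^{\{k-j\}}\big),
\]

where $\mathcal{R}\big(W^{\{i\}}, \phi; W^{\{i-1\}}\big) = \big(f(W^{\{i-1\}}) - A W^{\{i\}}, \phi\big)$ and the adjoint problems are defined,

\[
A^{\top} \phi^{[1]} = \psi,
\tag{2.15}
\]

\[
A^{\top} \phi^{[j]} = L_f\big(w^{\{k-j+1\}}, W^{\{k-j+1\}}\big)^{\top} \phi^{[j-1]}, \quad j = 2, \dots, k,
\tag{2.16}
\]

using the linearization $L_f(v, V)$ defined,

\[ L_f(v, V) = \int_0^1 J\big(sv + (1 - s)V\big)\, ds, \]

where $J(\cdot)$ is the Jacobian of $f$.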

Proof 

The  steps  are  similar  to  the  previous  arguments. 


2.2.1. Linearization and adjoints for nonlinear adjoints. In general, there are multiple ways to define an adjoint to a nonlinear operator [10]. Choosing a suitable definition is highly dependent on the purpose for which the adjoint is intended. For a perturbation or error analysis, one systematic way to define an adjoint is based on linearization. Consider an approximate solution $W \approx w$ of (2.12) computed without iteration, and define the residual

\[ \mathcal{R}(W) = AW - f(W). \]

The standard analysis begins

\[ A(w - W) - \big(f(w) - f(W)\big) = -\mathcal{R}(W). \]

The integral Mean Value Theorem yields

\[ f(w) - f(W) = L_f(w, W)(w - W) = \int_0^1 J\big(sw + (1 - s)W\big)\, ds \,(w - W). \]

Introducing the formal adjoint problem

\[ (A - L_f)^{\top} \phi = \psi \]

leads to the a posteriori error estimate

\[ (w - W, \psi) = \big(-\mathcal{R}(W), \phi\big). \]

In practice, the linearization $L_f$ cannot be computed and it is approximated as,

\[
L_f(v, V) = \int_0^1 J\big(sv + (1 - s)V\big)\, ds \approx \int_0^1 J\big(sV + (1 - s)V\big)\, ds = J(V).
\tag{2.17}
\]


The  approximation  in  the  linearization  may  certainly  affect  the  accuracy  of  the  a  posteriori 
error  estimate.  However,  the  effect  can  be  bounded  a  priori  under  regularity  assumptions  on  the 
problem,  i.e.  the  second  derivatives  of  /  are  uniformly  bounded  in  a  compact  region  containing 
the  true  solution,  analytic  iterative  solutions,  and  iterative  numerical  solutions,  and  assuming 
that  the  iteration  converges  and  the  approximations  are  sufficiently  close  to  the  true  solution. 
In particular $W$ should be sufficiently close to $w$ and $\mathcal{R}(W)$ should be sufficiently small. In practice, the approximation (2.17) works well in many situations in the sense that the linearization
error  has  relatively  insignificant  effect  on  accuracy  of  the  estimate  when  the  numerical  solutions 
are  reasonably  accurate.  Perhaps  more  importantly,  catastrophic  failure  of  the  a  posteriori  error 
estimate,  that  is  an  estimate  that  is  low  when  the  actual  error  is  large,  is  relatively  difficult  to 
manage.  See  [8]  for  a  discussion  of  this  point  for  systems  of  reaction-diffusion  equations  and  an 
example  where  catastrophic  failure  is  created. 

However,  this  simple  approach  to  defining  an  adjoint  operator  when  operator  decomposition 
iteration  is  used  can  fail.  The  difficulty  is  that  an  iterative  solution  is  actually  solving  a  problem 
that  is  significantly  different  than  the  original  problem.  See  [2]  for  a  discussion  of  this  point. 

The approach to define an adjoint for Theorem 2.2 avoids this issue by constructing a global adjoint for the iterative solution operator via a sequence of coupled adjoints for each individual component problem obtained by linearization between the iterative analytic $w^{\{i\}}$ and iterative numeric $W^{\{i\}}$ solutions. The effects of this “local” linearization can be controlled under local regularity and convergence assumptions as described above. In practice,

\[ A^{\top} \phi^{[j]} = L_f\big(w^{\{k-j+1\}}, W^{\{k-j+1\}}\big)^{\top} \phi^{[j-1]} \]

is approximated by

\[
A^{\top} \tilde\phi^{[j]} = J\big(W^{\{k-j+1\}}\big)^{\top} \tilde\phi^{[j-1]}
\tag{2.18}
\]

for $j = 2, \dots, k$.

The price of the indicated approach to defining an adjoint for the iterated solution operator is that the a posteriori error estimate accounts only for the numerical error $e^{\{k\}}$ and leaves the analytic iterative error $\varepsilon^{\{k\}}$ remaining to be estimated. We adapt the estimate discussed in Sec. 2.1.3.
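The nonlinear representation of Theorem 2.2, with both the averaged linearization and its practical replacement, can be exercised on a small system. This is a sketch under explicit assumptions: NumPy is used, the matrix `A` below is hypothetical (it is not the matrix of the paper's examples), the nonlinearity is the one from Section 2.2.2 with $\alpha = 1$, $\beta = 10$, numerical error is mimicked by rounding to six decimals, and the averaged Jacobian is approximated by quadrature.

```python
import numpy as np

PERM = np.array([2, 3, 0, 1])
SHIFT = np.array([0.4, 0.6, 0.5, 0.3])
ALPHA, BETA = 1.0, 10.0

def f(w):
    return ALPHA * np.exp(-BETA * (w[PERM] - SHIFT) ** 2)

def jac(w):
    d = w[PERM] - SHIFT
    J = np.zeros((4, 4))
    J[np.arange(4), PERM] = -2.0 * ALPHA * BETA * d * np.exp(-BETA * d ** 2)
    return J

def L_f(v, V, order=12):
    """Averaged Jacobian int_0^1 J(s v + (1 - s) V) ds by Gauss-Legendre quadrature."""
    x, wt = np.polynomial.legendre.leggauss(order)
    return sum(0.5 * c * jac(0.5 * (s + 1.0) * v + 0.5 * (1.0 - s) * V)
               for s, c in zip(x, wt))

def iterates(A, k, decimals=None):
    """Successive approximation A w^{i} = f(w^{i-1}), optionally rounded."""
    w = np.zeros(4)
    out = [w]
    for _ in range(k):
        w = np.linalg.solve(A, f(w))
        if decimals is not None:
            w = np.round(w, decimals)
        out.append(w)
    return out

def estimate(A, psi, w, W, exact=True):
    """Sum over j of (f(W^{k-j}) - A W^{k-j+1}, phi^{[j]})."""
    k = len(W) - 1
    phi = np.linalg.solve(A.T, psi)                 # first adjoint problem
    total = 0.0
    for j in range(1, k + 1):
        if j > 1:
            i = k - j + 1
            L = L_f(w[i], W[i]) if exact else jac(W[i])
            phi = np.linalg.solve(A.T, L.T @ phi)   # next adjoint problem
        total += float((f(W[k - j]) - A @ W[k - j + 1]) @ phi)
    return total

A = 5.0 * np.eye(4) + 0.5 * np.ones((4, 4))         # hypothetical matrix
psi = np.array([0.0, 1.0, 0.0, 0.0])                # quantity of interest: w_2
k = 10
w = iterates(A, k)
W = iterates(A, k, decimals=6)

true_error = float((w[k] - W[k]) @ psi)
est_exact = estimate(A, psi, w, W, exact=True)
est_practical = estimate(A, psi, w, W, exact=False)
```

With the averaged linearization the estimate reproduces the numerical error essentially exactly; the practical variant differs only through the linearization error, which stays small here because the numerical iterates are close to the analytic ones.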


2.2.2. Numerical examples. We now consider three new situations that may arise for nonlinearly coupled problems. Let

\[ Aw = f(w), \]

where $A$ is a fixed $4 \times 4$ matrix and

\[
f(w) = \alpha
\begin{pmatrix}
\exp\big(-\beta (w_3 - 0.4)^2\big) \\
\exp\big(-\beta (w_4 - 0.6)^2\big) \\
\exp\big(-\beta (w_1 - 0.5)^2\big) \\
\exp\big(-\beta (w_2 - 0.3)^2\big)
\end{pmatrix},
\]

and let the quantity of interest be the value of $w_2$. In all cases, a high accuracy reference solution $w$ was generated using Newton’s method with a tolerance of $10^{-12}$ while we also computed high accuracy reference iterative solutions to full precision. To compare to estimates employing the adjoint problem associated with the original system, the “global” adjoint corresponding to the full (i.e. non-decomposed) problem was also constructed and solved and error estimates based on the adjoint to the full problem are reported in the following examples. We set:

•  $\mathcal{R}(W, \phi)$ is the error estimate obtained by solving an adjoint problem using the exact linearization of the global adjoint problem, i.e. linearizing around the average of the reference solution $w$ and the approximate solution $W$, while

•  $\mathcal{R}(W, \tilde\phi)$ is the error estimate obtained from using an approximate linearization of the global adjoint problem.

The approximate solutions were computed by rounding to the 6th decimal place. All adjoint solutions were computed with full precision. In each case we estimated the iteration error using (2.11).
Increasing the value of $\alpha$ essentially reduces the diagonal dominance of the operator decomposition, damaging iterative convergence, while increasing $\beta$ raises the linearization error.


Cancellation between iteration and numerical error. Let $\alpha = 1$, $\beta = 10$, $k = 40$. The iteration
converges  after  40  iterations  when  the  norm  of  the  residual  is  less  than  10“^.  Results  are  provided 
in Table 2.2.

Error $|w_2 - W_2|$    Estimated Error $\mathcal{R}(W,\phi)$    Practical Error $\mathcal{R}(W,\tilde\phi)$    Iteration Error $w_2 - w_2^{\{k\}}$
1.1102276 × 10-*^      1.1102276 × 10“®                         1.10983 × 10"®                                 -2.55547 × 10"®

Numerical Error        Est. Numerical Error                     Pract. Numerical Error                         Est. Iteration Error
1.44524 × 10-®         1.44524 × 10"®                           1.44592 × 10"®                                 -6.6554 × 10"®

Table 2.2. Error components for example 2.2.2. Note the cancellation of errors between the iteration and numerical error.


Notice that the adjoint to the coupled problem gives an excellent estimate of the error, yielding four significant figures even when using the approximate linearization. What is obscured is the cancellation that occurs between the iteration and numerical error. The methods developed here enable the iteration and numerical errors to be estimated separately and this is important when constructing adaptive algorithms. However, the iteration error estimate is polluted by the numerical error in $W^{\{k\}}$. The partial sums of the history terms are plotted in Figure 2.2 where expected decay of history error contributions can be observed.

Figure 2.2. Numerical error estimate $\sum_{j=1}^{m} \mathcal{R}\big(W^{\{k-j+1\}}, \phi^{[j]}; W^{\{k-j\}}\big)$ including $m$ “history” terms for Ex. 2.2.2.


The effect of linearization error. Let $\alpha = 2$, $\beta = 8$, $k = 100$. In this example, the iteration fails to converge. In addition, $w^{\{k\}}$ and $W^{\{k\}}$ approach the same fixed point, but they both do so in a non-monotonic fashion. This produces significant differences between $L_f\big(w^{\{i\}}, W^{\{i\}}\big)^{\top}$ and $J\big(W^{\{i\}}\big)^{\top}$ (see equations (2.16) and (2.18) respectively) and consequently significant differences in the corresponding adjoint solutions $\phi$ and $\tilde\phi$. Results are provided in Table 2.3 where the practical numerical error estimate is seen to be completely inaccurate.

To obtain an estimate of how rapidly the adjoint solutions $\phi$ and $\tilde\phi$ can diverge, we see from (2.16) and (2.18) that

\[
A^{\top}\big(\phi^{[j]} - \tilde\phi^{[j]}\big) = L_f^{\top}\big(\phi^{[j-1]} - \tilde\phi^{[j-1]}\big) + (L_f - J)^{\top} \tilde\phi^{[j-1]},
\]

so the spectral radius of $A^{-\top}\big(L_f(w, W) - J(W)\big)^{\top}$ can be considered as an amplification factor to explain the exponential accumulation of linearization error in the history error estimate.


Error $w_2 - W_2$      Estimated Error $\mathcal{R}(W,\phi)$    Practical Error $\mathcal{R}(W,\tilde\phi)$    Iteration Error $w_2 - w_2^{\{k\}}$
-0.048166              0.048166                                 7.06634                                        -0.050168

Numerical Error        Est. Numerical Error                     Pract. Numerical Error                         Est. Iteration Error
0.002002               0.002002                                 3.58 × 10*^                                    -0.04568

Table 2.3. Error components for Ex. 2.2.2.

Figure 2.3. Differences between $L_f(w, W)$ and $J(W)$ and the spectral radius of $A^{-\top}\big(L_f(w, W) - J(W)\big)^{\top}$ for Ex. 2.2.2.


Divergent iteration. Let $\alpha = 2$, $\beta = 7$, $k = 100$. The iteration fails to converge. All error estimates in Table 2.4 are well-behaved but exhibit significant linearization error. Not surprisingly
since  the  iteration  error  estimate  assumes  a  convergent  iteration,  the  estimate  of  the  iteration  error  is 
poor.  However,  despite  the  fact  that  the  iteration  is  divergent,  the  estimate  of  the  numerical  error  is 
accurate  and  the  effectivity  ratio  (defined  as  the  ratio  of  the  error  estimate  to  the  true  error)  shown 
in  Figure  2.4  improves  as  the  number  of  history  terms  is  increased,  although  the  practical  numerical 
error  estimate  is  affected  by  errors  resulting  from  the  linearization. 
Error $|w_2 - W_2|$    Estimated Error $\mathcal{R}(W,\phi)$    Practical Error $\mathcal{R}(W,\tilde\phi)$    Iteration Error $w_2 - w_2^{\{k\}}$
0.0999965              0.0999965                                0.076506                                       0.0999963

Numerical Error        Est. Numerical Error                     Pract. Numerical Error                         Est. Iteration Error
1.8524 × 10^{-6}       1.8524 × 10^{-6}                         1.8530 × 10^{-6}                               0.18273

Table 2.4. Error components for Ex. 2.2.2.

Figure 2.4. Effectivity ratio for the numerical error estimate $\sum_{j=1}^{m} \mathcal{R}\big(W^{\{k-j+1\}}, \phi^{[j]}; W^{\{k-j\}}\big)$ including $m$ “history” terms for Ex. 2.2.2.


2.2.3.  Discussion.  An  interesting  case  is  the  situation  in  which  the  numerical  error  is  much  higher 
than  the  stopping  criteria  (say  round  U  to  the  third  decimal  place),  but  the  corresponding  theoretical 
iteration  converges  quickly.  The  computation  does  not  converge  due  to  the  numerical  error,  but  the 
quick  computation  of  a  few  history  terms  leads  to  an  exact  estimate  of  the  numerical  error.  See 
Section  4.2.4  for  an  adaptive  solution  to  a  similar  problem. 

In  all  three  examples,  the  global  adjoints  are  not  diagonally  dominant  (or  SPD).  But  even  when 
the operator decomposition iteration solution for the global adjoint (e.g. a “Jacobi” iteration) does
not  converge,  the  a  posteriori  error  analysis  still  provides  meaningful  error  estimates.  See  Section 
4.2.2  for  more  discussion. 


3.  ANALYSIS  FOR  SEMILINEAR  ELLIPTIC  SYSTEMS 

For  ease  of  presentation,  we  focus  the  analysis  on  a  two  component  fully  coupled  elliptic  system  of 
the  form 


\[
\begin{aligned}
-\nabla \cdot a_1(x) \nabla u_1 + b_1(x) \cdot \nabla u_1 + c_1(x) u_1 &= f_1(x, u_1, u_2, Du_2), & x &\in \Omega, \\
-\nabla \cdot a_2(x) \nabla u_2 + b_2(x) \cdot \nabla u_2 + c_2(x) u_2 &= f_2(x, u_1, Du_1, u_2), & x &\in \Omega, \\
u_1 = u_2 &= 0, & x &\in \partial\Omega,
\end{aligned}
\tag{3.1}
\]

where $\Omega$ is a convex polygonal domain in $\mathbb{R}^d$, $d = 1, 2, 3$, with boundary $\partial\Omega$, and we assume that $a_i, b_i, c_i, f_i$, $i = 1, 2$, are sufficiently smooth to establish optimal order a priori convergence for the finite element method computed without operator decomposition iteration.
The weak formulation of (3.1) is: find $u_i \in \widetilde{W}_2^1(\Omega)$, $i = 1, 2$, satisfying

\[
\begin{aligned}
A_1(u_1, v_1) &= \big(f_1(x, u_1, u_2, Du_2), v_1\big), & \forall v_1 &\in \widetilde{W}_2^1(\Omega), \\
A_2(u_2, v_2) &= \big(f_2(x, u_1, Du_1, u_2), v_2\big), & \forall v_2 &\in \widetilde{W}_2^1(\Omega),
\end{aligned}
\tag{3.2}
\]

where

\[
\begin{aligned}
A_1(u_1, v_1) &= (a_1 \nabla u_1, \nabla v_1) + (b_1 \cdot \nabla u_1, v_1) + (c_1 u_1, v_1), \\
A_2(u_2, v_2) &= (a_2 \nabla u_2, \nabla v_2) + (b_2 \cdot \nabla u_2, v_2) + (c_2 u_2, v_2),
\end{aligned}
\]

are assumed to be coercive bilinear forms on $\Omega$ and $\widetilde{W}_p^1(\Omega)$ represents the subspace of $W_p^1(\Omega)$ with zero trace on $\partial\Omega$.

If we were numerically solving (3.2) without operator decomposition iteration, we would introduce conforming discretizations $\mathcal{S}_{h,m}(\Omega) \subset \widetilde{W}_2^1(\Omega)$, $m = 1, 2$, and then solve the discretized system: find $U_m \in \mathcal{S}_{h,m}(\Omega)$ satisfying

\[
\begin{aligned}
A_1(U_1, \chi_1) &= \big(f_1(x, U_1, U_2, DU_2), \chi_1\big), & \forall \chi_1 &\in \mathcal{S}_{h,1}(\Omega), \\
A_2(U_2, \chi_2) &= \big(f_2(x, U_1, DU_1, U_2), \chi_2\big), & \forall \chi_2 &\in \mathcal{S}_{h,2}(\Omega).
\end{aligned}
\]

3.1. Analysis of operator decomposition iteration solutions

We analyze three different operator decomposition iterations for producing numerical approximations of (3.1). We recall that we decompose the error as in (1.2), i.e.,

\[
u - U^{\{k\}} = \underbrace{\big(u - u^{\{k\}}\big)}_{\text{analytic iteration error}} + \underbrace{\big(u^{\{k\}} - U^{\{k\}}\big)}_{\text{numerical error}},
\]

where $u^{\{k\}}$, $U^{\{k\}}$ respectively are the analytic and discrete solutions obtained by the particular iteration approach under consideration. The a posteriori error analysis is for the numerical error $e^{\{k\}}$. We estimate the analytic iteration error $\varepsilon^{\{k\}}$ by a natural application of the estimates discussed in Sec. 2.1.3. To simplify the analysis, if the resulting operator decomposition elliptic equation for $U_i$ is nonlinear (in $U_i$), we assume that the error introduced by its nonlinear solve is negligible.

For simplicity of presentation, we assume the quantity of interest is given as a linear functional of the second solution component, determined by $\psi = (0, \psi_2)^{\top}$, i.e. we estimate $\big(e_2^{\{k\}}, \psi_2\big)$. For notational convenience and to be consistent with [1], we abbreviate the weak residual of a solution component,

\[ \mathcal{R}_m(U_m, \chi; \nu) = \big(f_m(\nu), \chi\big) - A_m(U_m, \chi), \quad m = 1, 2. \]

3.1.1. Block Jacobi iteration. We first consider a block Jacobi operator decomposition iterative approach to the solution of the semilinear elliptic system (3.1).


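Algorithm 1 Block Jacobi algorithm

Given $U_1^{\{0\}}$ and $U_2^{\{0\}}$
for $i = 1, 2, 3, \dots$ until convergence do
    Find $U_1^{\{i\}} \in \mathcal{S}_{h,1}(\Omega)$ such that $A_1\big(U_1^{\{i\}}, \chi_1\big) = \big(f_1(x, U_1^{\{i\}}, U_2^{\{i-1\}}, DU_2^{\{i-1\}}), \chi_1\big)$ for all $\chi_1 \in \mathcal{S}_{h,1}(\Omega)$
    Find $U_2^{\{i\}} \in \mathcal{S}_{h,2}(\Omega)$ such that $A_2\big(U_2^{\{i\}}, \chi_2\big) = \big(f_2(x, U_1^{\{i-1\}}, DU_1^{\{i-1\}}, U_2^{\{i\}}), \chi_2\big)$ for all $\chi_2 \in \mathcal{S}_{h,2}(\Omega)$
end for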
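The structure of the block Jacobi iteration can be sketched on a one-dimensional model problem. The code below is an illustration under clearly different assumptions from the paper: it uses second-order finite differences instead of finite elements, the model system $-u_1'' = f_1(u_2)$, $-u_2'' = f_2(u_1)$ on $(0, 1)$ with homogeneous Dirichlet conditions, and hypothetical right-hand sides `f1` and `f2`; all names are introduced here for illustration.

```python
import numpy as np

def laplacian_1d(n):
    """Finite difference matrix for -u'' on (0, 1) with zero boundary values."""
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h ** 2

def block_jacobi(f1, f2, n=50, max_iter=50, tol=1e-10):
    """Block Jacobi: both components are updated from the previous iterate."""
    K = laplacian_1d(n)
    U1 = np.zeros(n)
    U2 = np.zeros(n)
    i = 0
    for i in range(1, max_iter + 1):
        new1 = np.linalg.solve(K, f1(U2))    # component 1 uses the old U2
        new2 = np.linalg.solve(K, f2(U1))    # component 2 uses the old U1
        change = max(np.abs(new1 - U1).max(), np.abs(new2 - U2).max())
        U1, U2 = new1, new2
        if tol > change:
            break
    return U1, U2, i

# Hypothetical coupling terms (each depends only on the other component).
def f1(u2):
    return 1.0 + 0.5 * np.sin(u2)

def f2(u1):
    return 1.0 + 0.5 * np.cos(u1)

U1, U2, iterations = block_jacobi(f1, f2)
K = laplacian_1d(50)
coupled_residual = float(np.abs(K @ U1 - f1(U2)).max())
```

Each sweep consists of two independent "single physics" solves, which is what makes the iteration attractive when separate codes handle the components. A block Gauss-Seidel variant would instead pass `new1` to `f2` within the same sweep.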


Theorem  3. 1 

The  representation  formula  for  the  numerical  error  is 

j<k,  j  odd  j  even 

(3.4) 
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where  the  corresponding  adjoint  problems  are 

j<k,j  odd, 

■^i{x,<i>i^)  =  {x,'>Pi^),  Vx  e  1^2  (n),  j  <k,  j  even, 

with

$$A_1^*(v, w) = (\nabla v, a_1 \nabla w) - (v, \operatorname{div}(b_1 w)) + (v, (c_1 - J_{1,1}(U)) w),$$

$$A_2^*(v, w) = (\nabla v, a_2 \nabla w) - (v, \operatorname{div}(b_2 w)) + (v, (c_2 - J_{2,2}(U)) w),$$

and the additional adjoint data is defined recursively as

=  0, 

(x,V'2'^')  =  j  <  k,  j  odd,  (3.7) 

j  <  k,  j  even, 

where $J_{m,n}(V)$ is the Jacobian of $f_m$ with respect to $u_n$ evaluated at $V$.

If  we  wish  to  highlight  the  final  (fcth)  iteration,  then  we  can  write 

2<j<A;,  j  even 

+  ^  (3-8) 

3<j</c,  j  odd 

Proof 

In the following, we simplify notation in the functions $f_i$, so for example we write
$f_1(x, u_1, u_2, Du_2) = f_1(u_2)$, and so on.

=  {e2*\4^') 

=  ^2(e2*^</’20 

=  A2{el,''\4>2^) 

=  (/2([/f  "'^4'’)  “^2(f/j"\4‘')  +  (/2(4'"'^),4'')  -  (/2(f//""'^).4‘’) 

=  n2{ut\4^-,ul'~"^)  +  (L/2(4"“'\[//"“'^)ep-'\4'') 

=  7^2(^/2^"^4^;[//'=“'^)  +^t(ep-‘\4"') 

+  (L/i  ,  4"1) 


j<k,  j  odd 


j<kf  j  even 


Copyright © 2010 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng (2010), DOI: 10.1002/nme




□ 


The estimate in this case takes into account the numerical error arising from each component
solution and the inherited errors passed between iterations. Following the discussion in Sec. 2.2.1,
we are using the adjoint naturally associated with the analytic iterative solution operator. We use
“local” linearization between $u^{\{j\}}$ and $U^{\{j\}}$ to define the required adjoints for each
component equation in the iteration. The effects of this linearization can be controlled assuming
sufficient regularity of the solution and a priori convergence of the discretization, and we expect the
estimates to be accurate for all sufficiently accurate numerical solutions. The global adjoint of the
iterated solution operator is therefore obtained by a sequence of “single physics solves”. We note
that we still have to estimate the analytic iteration error $u - u^{\{k\}}$.


3.1.2. Block Gauss-Seidel iteration Next, we consider a block Gauss-Seidel operator decomposition
iteration.


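Algorithm 2  Block Gauss-Seidel algorithm

Given $U_2^{\{0\}}$
for i = 1, 2, 3, ... until convergence do
    Find $U_1^{\{i\}} \in S_{h,1}(\Omega)$ such that $A_1(U_1^{\{i\}}, \chi_1) = (f_1(x, U_1^{\{i\}}, U_2^{\{i-1\}}, DU_2^{\{i-1\}}), \chi_1)$ for all $\chi_1 \in S_{h,1}(\Omega)$
    Find $U_2^{\{i\}} \in S_{h,2}(\Omega)$ such that $A_2(U_2^{\{i\}}, \chi_2) = (f_2(x, U_1^{\{i\}}, DU_1^{\{i\}}, U_2^{\{i\}}), \chi_2)$ for all $\chi_2 \in S_{h,2}(\Omega)$
end for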
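The only structural difference from block Jacobi is that the second solve uses the freshly computed first component. A minimal sketch under simplifying assumptions: each "single physics solve" is represented by a plain Python callable on scalars, and the function name `block_gauss_seidel` and the example coupling are hypothetical.

```python
def block_gauss_seidel(solve1, solve2, u2, tol=1e-12, max_iter=100):
    """Block Gauss-Seidel: solve1 sees u2 from step i-1, solve2 sees u1 from step i."""
    u1 = solve1(u2)
    for i in range(1, max_iter + 1):
        u1 = solve1(u2)       # uses the previous iterate of component 2
        u2_new = solve2(u1)   # uses the current iterate of component 1
        change = abs(u2_new - u2)
        u2 = u2_new
        if change < tol:
            return u1, u2, i
    return u1, u2, max_iter


# Hypothetical scalar "physics": u1 = 1 + 0.5*u2 and u2 = 0.5*u1,
# whose coupled solution is u1 = 4/3, u2 = 2/3.
gs_u1, gs_u2, gs_iters = block_gauss_seidel(
    lambda v2: 1.0 + 0.5 * v2,
    lambda v1: 0.5 * v1,
    0.0,
)
```

Only an initial guess for the second component is needed, which matches the single "Given" quantity in the algorithm statement.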


Theorem  3.2 

The  representation  formula  for  the  numerical  error  is 


'  ...  .  ..  " 


within  iteration  errors 


between  iteration  errors 


where  the  corresponding  adjoint  problems  are 


=  (x,V'2^').  yxewjin),  j  =  i,...,k, 
Vx € Wj{n),  j  =  i,...,k, 


and  the  adjoint  data  is  defined  recursively  as 

=  ■V'.'V'l*'  =  0, 

(x,V'2^'')  =  j  =2,...,A:,  (3.12) 


Proof 
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7e2(t/f 

7^2(^/j"^4'i^^/"^)  +  (/l(t^2"“‘^).<^l'') 
+(/i(4*^"'^).4")  -  (/i(t^2‘'^“'’).4'') 
7^2  (t/f  ^ ,  4" ;  ,  4“ ;  ) 


{^“J+*}  ^[2l.rr{^~2} 


,4r<^2  )■  (3-13) 


i=i 


2  =  1 


□ 


In  this  case,  in  contrast  with  (3.4),  there  are  contributions  to  the  error  reflecting  both  “within”  and 
“between  iteration”  errors. 


3.1.3. Relaxed block Jacobi iteration Both the Jacobi and Gauss-Seidel operator decomposition
iterations can be “relaxed” by letting $U_i^{\{k\}} = \alpha \tilde{U}_i^{\{k\}} + (1 - \alpha) U_i^{\{k-1\}}$, where a tilde denotes a
quantity computed before relaxation. The approximation is obtained via


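Algorithm 3  Relaxed block Jacobi algorithm

Given $U_1^{\{0\}}$ and $U_2^{\{0\}}$,
for i = 1, 2, 3, ... until convergence do
    Find $\tilde{U}_1^{\{i\}} \in S_{h,1}(\Omega)$ such that $A_1(\tilde{U}_1^{\{i\}}, \chi_1) = (f_1(x, \tilde{U}_1^{\{i\}}, U_2^{\{i-1\}}, DU_2^{\{i-1\}}), \chi_1)$ for all $\chi_1 \in S_{h,1}(\Omega)$
    Calculate new iterate $U_1^{\{i\}} = \alpha \tilde{U}_1^{\{i\}} + (1 - \alpha) U_1^{\{i-1\}}$
    Find $\tilde{U}_2^{\{i\}} \in S_{h,2}(\Omega)$ such that $A_2(\tilde{U}_2^{\{i\}}, \chi_2) = (f_2(x, U_1^{\{i-1\}}, DU_1^{\{i-1\}}, \tilde{U}_2^{\{i\}}), \chi_2)$ for all $\chi_2 \in S_{h,2}(\Omega)$
    Calculate new iterate $U_2^{\{i\}} = \alpha \tilde{U}_2^{\{i\}} + (1 - \alpha) U_2^{\{i-1\}}$
end for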
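A scalar toy problem shows why relaxation can rescue an iteration that otherwise diverges. This is an illustrative sketch only: the two "solves" are hypothetical scalar maps with a strong, sign-alternating coupling ($u_1 = 1 - 2u_2$, $u_2 = 2u_1$, with coupled solution $u_1 = 0.2$, $u_2 = 0.4$), and the name `relaxed_block_jacobi` is invented here.

```python
def relaxed_block_jacobi(solve1, solve2, u1, u2, alpha, tol=1e-10, max_iter=500):
    """Relaxed block Jacobi: blend each unrelaxed solve with the previous iterate."""
    for i in range(1, max_iter + 1):
        t1 = solve1(u2)  # unrelaxed ("tilde") value, uses step i-1 data
        t2 = solve2(u1)  # unrelaxed ("tilde") value, uses step i-1 data
        n1 = alpha * t1 + (1.0 - alpha) * u1
        n2 = alpha * t2 + (1.0 - alpha) * u2
        change = max(abs(n1 - u1), abs(n2 - u2))
        u1, u2 = n1, n2
        if change < tol:
            return u1, u2, i, True
    return u1, u2, max_iter, False


def s1(v2):
    return 1.0 - 2.0 * v2  # hypothetical component-1 solve


def s2(v1):
    return 2.0 * v1        # hypothetical component-2 solve


# alpha = 1 is the unrelaxed Jacobi iteration; alpha = 0.2 is under-relaxed.
_, _, _, plain_converged = relaxed_block_jacobi(s1, s2, 0.0, 0.0, alpha=1.0, max_iter=100)
r1, r2, relaxed_iters, relaxed_converged = relaxed_block_jacobi(s1, s2, 0.0, 0.0, alpha=0.2)
```

In this example the unrelaxed iteration has amplification factor 2 per step, while the relaxed iteration contracts by about $\sqrt{0.8}$ per step, at the price of many more iterations and therefore many more history terms.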


This is often done in practice in order to aid convergence of the iteration, but it poses more
challenges for a posteriori error analysis, and we present a partial analysis to explain how the results
can be extended to handle relaxation. Letting $\tilde{u}^{\{k\}}$ and $\tilde{U}^{\{k\}}$ represent the corresponding
quantities computed without relaxation, we have

$$(u - U^{\{k\}}, \psi) = (u - u^{\{k\}}, \psi) + \alpha(\tilde{e}^{\{k\}}, \psi) + (1 - \alpha)(e^{\{k-1\}}, \psi). \tag{3.14}$$
Using the notation above, we have

=  a  [7^2(^/i'=^4'l; +  (/2{nf-'^4i)  -  (/2(C/^'‘'^), 4')] 

+(1  -Q)(ef~*\V’2') 

+(1 

+a(l  -  a)  [7l2(uf-'>,41iC/f-">)  + 

(1  -a)2(ef“^\V'2‘’) 

i=l 

In order to obtain a full a posteriori error estimate, we have to repeat this argument to estimate
the analogous terms for $j = 1, \ldots, k$. We refrain from doing this. Clearly, repeated application of this analysis
approach results in a great number of adjoint problems to be solved. It is to be hoped that introducing
relaxation means that relatively few history terms need to be included for accuracy in the
estimate. As expected, the decay rate of the history terms decreases as $\alpha \to 0$.


4.  NUMERICAL  EXAMPLES  FOR  FULLY  COUPLED  SEMILINEAR  ELLIPTIC  SYSTEMS 

We  present  a  number  of  examples  to  illustrate  characteristics  of  the  a  posteriori  error  estimate.  We 
also  illustrate  the  use  of  the  estimates  as  the  basis  to  adaptively  adjust  discretization  parameters 
based  on  relative  sizes  of  contributions  to  the  error  estimate. 

Some  of  the  examples  are  used  to  explore  the  accuracy  of  the  a  posteriori  error  estimate  in 
situations  in  which  the  operator  decomposition  iteration  is  converging  well.  To  measure  the  accuracy 
of  the  error  estimate,  we  report  on  the  ratio  of  the  estimate  to  the  true  error  (or  an  approximation 
computed  with  a  highly  accurate  reference  solution).  This  type  of  a  posteriori  error  estimate  tends 
to  be  robustly  accurate  for  a  wide  range  of  spatial  meshes,  as  has  been  well  reported  in  the  literature. 
We  observe  the  same  robust  accuracy  in  the  estimates  when  the  iteration  is  converging  well.  Rather 
than  reporting  extensively  on  that  quality,  we  concentrate  the  experiments  on  situations  in  which  the 
estimates  do  not  work  as  well  as  hoped. 

For adaptive error control, we employ a natural generalization of the “mark and refine” strategy
based on the Principle of Equidistribution [1, 5-8]. In this approach, the error estimate is replaced
by a bound obtained by replacing the signed contributions to the error estimate with their absolute
values. Starting with a coarse discretization (a mesh with large element size and a small number of
iterations), we adjust the various discretization parameters according to the relative size of the
contributions and an estimate of the computational costs associated with changes in the parameters.
This adaptive approach is described fully in the earlier paper [1].
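The marking step of such a strategy can be sketched as follows. This is a minimal illustration under stated assumptions, not the full cost-aware procedure of [1]: the element contributions are taken as given numbers, the function name `mark_elements` is hypothetical, and an element is marked when the absolute value of its contribution exceeds an equal share TOL/N of the tolerance.

```python
def mark_elements(contributions, tol):
    """Return indices of elements whose |contribution| exceeds tol / (number of elements).

    The signed contributions are replaced by absolute values, so the stopping
    test is applied to a bound on the error rather than to the estimate itself.
    """
    n = len(contributions)
    bound = sum(abs(c) for c in contributions)
    if bound <= tol:
        return []  # bound already meets the tolerance; nothing to refine
    share = tol / n
    return [k for k, c in enumerate(contributions) if abs(c) > share]


# Hypothetical signed element contributions to the error estimate.
example_contributions = [1e-5, -4e-4, 2e-6, 3e-4]
marked = mark_elements(example_contributions, tol=1e-4)
marked_loose = mark_elements(example_contributions, tol=1e-2)
```

With these numbers the equal share is $2.5 \times 10^{-5}$, so only the second and fourth elements are marked; with the looser tolerance the bound is already satisfied and no element is marked.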

We make one simplification below. In [1], we allowed the different components to have different
meshes, which requires modification of the a posteriori error analysis to account for the effect of
mesh changes on the numerical solution. Below, we use the same mesh for all components of the
system in order to avoid introduction of the additional terms in the a posteriori error estimate. The
results from the first paper can be combined with the results in this paper in a straightforward way.


4.1. Fully coupled linear system


The first example is a coupled system of linear equations for which we can compute an exact
solution. The coupled problem is: Find $u_1(x, y)$, $u_2(x, y)$ satisfying

$$\begin{aligned}
-\nabla \cdot (a \nabla u_1) + b_1 \cdot \nabla u_1 &= f_1(u_2), & x \in \Omega, \\
-\Delta u_2 &= b_2 \cdot \nabla u_1, & x \in \Omega, \\
u_1 = u_2 &= 0, & x \in \partial\Omega,
\end{aligned} \tag{4.1}$$

where $\Omega := \{(x, y) : (x, y) \in [0, 1] \times [0, 1]\}$,


a  =  10  -f  (—  cos  Attx  4-  -  cos  ny) 

X  (j  ^ 


1 

■  8 

1  . 

T 

1  1 

1  .  . 

tt 

sin47ra; 

1  (j 

-sin  Try 

1 

II 

<N 

-0 

-sm47rx  2  sin  Try 

lT 


$$f_1(u_2) = 85\pi^2 \left(2 \sin 4\pi x \sin \pi y + u_2\right).$$


This system has exact solution

$$u_1 = \sin 4\pi x \sin \pi y, \qquad u_2 = \pi^{-2}\left(\frac{\sin 8\pi x \sin \pi y}{65} + \frac{\sin 4\pi x \sin 2\pi y}{20}\right).$$
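As a quick consistency check on the stated exact solution (a worked verification added for illustration, reading the solution as $u_2 = \pi^{-2}(\sin 8\pi x \sin \pi y / 65 + \sin 4\pi x \sin 2\pi y / 20)$), note that each product of sines is an eigenfunction of $-\Delta$ on the unit square, so the left-hand side of the second equation can be evaluated term by term:

```latex
\begin{align*}
-\Delta\left(\sin 8\pi x \,\sin \pi y\right)
  &= (64 + 1)\pi^2 \sin 8\pi x \,\sin \pi y
   = 65\pi^2 \sin 8\pi x \,\sin \pi y, \\
-\Delta\left(\sin 4\pi x \,\sin 2\pi y\right)
  &= (16 + 4)\pi^2 \sin 4\pi x \,\sin 2\pi y
   = 20\pi^2 \sin 4\pi x \,\sin 2\pi y, \\
-\Delta u_2
  &= \sin 8\pi x \,\sin \pi y + \sin 4\pi x \,\sin 2\pi y .
\end{align*}
```

The factors 65 and 20 in $u_2$ are therefore exactly the eigenvalue multiples, and both terms vanish on the boundary of the unit square, consistent with the homogeneous Dirichlet condition.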

We select the quantity of interest $u_2(x_0)$ where $x_0 = (0.15, 0.15)$; the adjoint data is then equal to
the delta function $\delta_{x_0}$, which we regularize.

As  mentioned  we  solve  for  ui  and  U2  on  a  common  mesh,  using  piecewise  linear  finite  elements. 
The  Gauss-Seidel  iteration  proceeds  until  <  10“^.  Quadratic  finite  elements 

are  used  to  compute  $5.^'  and  approximations  to  and  Starting  with  a  uniform  initial 
mesh,  the  subsequent  meshes  are  adapted  identically  using  the  sum  of  just  the  first  four  terms  in  the 
error  representation  formula,  namely,  the  sum  of: 

1. the “primary” error, $\mathcal{E}_1$,

2. the “transfer” error, $\mathcal{E}_2$,

3. the “inherited” error, $\mathcal{E}_3$, and

4. the “inherited transfer” error, $\mathcal{E}_4$.

The  corresponding  adjoint  problems  used  to  compute  the  weighted  residuals  are 

-42(x.4‘')  =  (x,<^i:®),  Vxe52^(n), 

-4I(x.0f' )  =  (x,  -div(620W)).  vx  e  s^{n), 

^2(x.4'')  =  (x,  -div(6i,Af’)),  Vx  e 

“^1  (x>0f’)  =  {x.-div(M2^')).  Vx  e  s^{n). 

The initial mesh is a uniform partition with step size $h = 0.1$ for 200 elements. The iteration is
repeated until the total error representation estimate for the quantity of interest is less than $10^{-4}$.
The adaptive algorithm ran for three iterations before meeting the tolerance. The final mesh has
2378 elements. The number of iterations used in Gauss-Seidel on the finest (last) mesh is 5. The
iteration  error  estimate  reports  1.43  x  10“®. 

The  effectivity  ratio  (estimate/error)  is  .9976. 

Table 4.1 shows the contributions to the error. We observe that the “transfer” error $\mathcal{E}_2$ is about
1/5th of the size of the “primary” discretization error $\mathcal{E}_1$ at iteration $k$, and that the “inherited transfer
error” $\mathcal{E}_4$ is about 1/10th of the size of the primary discretization error. By contrast $\mathcal{E}_3$, the error inherited
from $U_2$ at the iteration $(k - 1)$, is 1/500th the size of $\mathcal{E}_1$.
Term                                              Value
Primary error ($\mathcal{E}_1$)                   $0.5070 \times 10^{-4}$
Transfer error ($\mathcal{E}_2$)                  $0.0940 \times 10^{-4}$
Inherited error ($\mathcal{E}_3$)                 $-0.0010 \times 10^{-4}$
Inherited transfer error ($\mathcal{E}_4$)        $0.0409 \times 10^{-4}$

Table 4.1. The first four error terms, i.e., the primary error and transfer error at iteration $k$ and the inherited
error and inherited transfer error from the iteration $(k - 1)$ for Ex. 4.1.
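The relative sizes quoted in the text follow directly from the four tabulated values. The short calculation below simply reproduces that arithmetic; the dictionary keys are labels chosen for this sketch.

```python
# The four contributions reported in Table 4.1 (signed values).
terms = {
    "primary": 0.5070e-4,
    "transfer": 0.0940e-4,
    "inherited": -0.0010e-4,
    "inherited transfer": 0.0409e-4,
}

# Sum of the first four terms of the error representation.
four_term_sum = sum(terms.values())

# Size of each contribution relative to the primary discretization error.
relative_size = {name: abs(value) / abs(terms["primary"]) for name, value in terms.items()}

for name, ratio in relative_size.items():
    print(f"{name:>20s}: {ratio:.4f} of the primary error")
```

The ratios come out at roughly 0.185 (transfer), 0.081 (inherited transfer) and 0.002 (inherited), in line with the rounded fractions 1/5, 1/10 and 1/500 used in the discussion.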


Figure 4.1. The first four adjoint solutions, i.e., the primary, transfer and the first two inherited adjoint
solutions for Ex. 4.1.


4.2.  Effects  of  iteration  on  the  accuracy  of  the  error  estimate 

We  next  consider  a  fully  coupled  semilinear  system  that  is  solved  by  the  Jacobi  operator 
decomposition  iteration  employing  continuous  piecewise  linear  elements  with  initial  solutions 
$U_1^{\{0\}} = x$, $U_2^{\{0\}} = 0$. The coupled system is: Find $u_1(x), u_2(x)$ for $x \in \Omega = [0, 1]$ such that

-u"=  x&n, 

$$-u_2'' = \sin(\beta \pi u_1), \quad x \in \Omega,$$

$$u_1(0) = 0, \quad u_1(1) = 1, \quad u_2(0) = u_2(1) = 0.$$

The quantity of interest is the average value of $u_2$ over the whole domain. The sequence of adjoint
problems is

a:en, 

-(•^2*)"  =  V’2’^  2:en, 

^l(0)  =  <?i2(0)  =  0,  ,^,(1)  =  ^2(1)  =  0, 

with 

=  1 

j  odd 

=  /37rcos(^irf/j^~'^^)<^2  J  even. 

Quadratic  finite  elements  are  used  to  compute  #5;’^  and  $2^  approximations  to  (f>^f  and  (/>2 '  and  the 
iteration  is  performed  until  the  iteration  error  estimator  (2. 1 1)  is  less  than  10~®  or  until  a  maximum 
of  30  Gauss-Seidel  iterations  are  performed. 

We  examine  different  sets  of  parameters  that  affect  the  convergence  of  the  iteration  and, 
consequently,  accuracy  of  the  estimate. 

4.2.1. Slowly decaying history contributions. The first example demonstrates problems that can
arise when the history contributions decay very slowly. We fix $\beta = 10$ and compute approximations
for A = 2, 4, 8, 16. As A grows the nonlinearity becomes stronger and the diagonal dominance in the
iteration is weakened. We use a uniform mesh with $h = 0.05$ for the approximate solutions and
we use a fine mesh with $h = 0.005$ to compute a “reference” solution. We run the iterations until
IlC/lff  _c/{‘-i}||  <  io~®  up  to  a  maximum  of  30  iterations. 

The iteration converges for A = 2, 4 and 8, but for A = 16 the iteration cannot converge (without
relaxation) even using a finer spatial mesh. We report the error estimate effectivity (estimate/error)
ratios in Table 4.2.


A       ratio
2       0.999942588
4       0.997737471
8       0.978260934
16      1.647440221

Table 4.2. Effectivity ratios for Example 4.2.1.


These results illustrate the typical accuracy of the estimate. The estimate is even reasonably accurate in the
last case where, as noted, the iteration does not converge. The estimate in the last case is affected
by significant linearization errors since $U_2$ does not represent $u_2$ very well.

Figure  4.2  presents  the  partial  sums  of  the  first  j  history  terms  for  the  different  values  of  A.  For 
A  =  2, 4, 8,  the  error  contributions  become  relatively  constant  when  more  than  five  error  terms  are 
included,  but  in  all  cases  taking  simply  the  first  error  term  only  (the  “primary  error  contribution”) 
leads  to  a  poor  error  estimate. 
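The plateau behavior described here can be checked mechanically from a list of history contributions. The following sketch uses made-up, geometrically decaying terms rather than data from the paper, and the helper names `partial_sums` and `terms_needed` are hypothetical; it reports how many terms must be summed before the partial sums stay within a relative tolerance of the full sum.

```python
from itertools import accumulate


def partial_sums(terms):
    """Running sums of the error contributions, newest term first."""
    return list(accumulate(terms))


def terms_needed(terms, rel_tol=0.01):
    """Smallest m such that every partial sum from the m-th onward is within
    rel_tol (relative) of the sum of all available terms."""
    sums = partial_sums(terms)
    final = sums[-1]
    for m in range(1, len(sums) + 1):
        if all(abs(s - final) <= rel_tol * abs(final) for s in sums[m - 1:]):
            return m
    return len(sums)


# Hypothetical history contributions that halve with each older iteration.
history = [0.5 ** j for j in range(10)]
m_plateau = terms_needed(history)
primary_only_ratio = history[0] / sum(history)
```

For this synthetic sequence seven terms are needed to come within one percent of the ten-term sum, and the primary contribution alone accounts for only about half of it, which mirrors the observation that the first term by itself gives a poor estimate.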
Figure 4.2. Numerical error estimate including m history terms, for Ex. 4.2.1 and A = 2, 4, 8, 16.


4.2.2. The effect of large numerical error on the history contributions. Let A = 10, $\beta = 10$, $h =
0.05$. The fixed point iteration fails on a coarser mesh of $h = 0.2$, but for $h = 0.05$ the iteration
converges slowly. In this computation, the error estimate is affected both by the inaccuracies due to
linearizing about $U^{\{i\}}$ for $i$ relatively small and by the poor numerical resolution of the adjoint
solution $\Phi$ (computed using quadratic finite elements on the mesh for $U$). Both sources of numerical
error destroy the accuracy of the estimate. We see the effect in Figure 4.3, where we observe that
the contributions to the error estimator increase as the final (“older”) history terms are included,
after having reached an earlier plateau. This is the opposite of the expected behavior that error
contributions should decay, and is an indicator that the error estimator is not performing well.
Large adjoint solutions relative to the size of the solution can be an indicator of unreliable error
estimates.


Figure 4.3. Numerical error estimate including m history terms for Ex. 4.2.2, demonstrating that
numerical error can lead to non-convergence and poor numerical error estimates.


4.2.3. The effect of a poor initial iterate on the history contributions. Let A = 10, $\beta = 10$, $h =
0.05$. The choice of initial data for the iteration can strongly influence the performance of the
error estimator. In Figure 4.4 we plot the error contributions for the initial iterate $0$ versus the initial iterate $x(1 - x)$.
In this case, the norm of several of the early solution iterates is greater than $10^{4}$, while the
norm of the converged solution is $O(1)$. The difficulty is that each history adjoint problem is
solved  using  quadratic  finite  elements  of  size  h  =  0.05,  and  the  resulting  use  of  $1^1  instead 
of  (f)^^  in  the  resulting  error  representation  formula  yields  an  error  which  may  be  bounded  by 

(assuming  smooth  i/),  and 
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u).  The  relative  size  of  and  means  that  the  resulting  error  estimate  is  useless,  despite 

the  exponential  decay  of  as  j  increases.  Incorporating  additional  history  terms  can  reduce  the 
quality  of  the  estimator. 


Figure 4.4. Individual “history” terms for Ex. 4.2.3, illustrating the sensitivity
of history error contributions to initial data.

4.2.4. Adaptive selection of relaxation parameter and mesh resolution. In the three previous
examples, problems with accuracy of the error estimate arise from slow convergence of the iteration
and subsequent slow decay of the history contributions to the error. In Sec. 3.1.3 we described
an extension of the a posteriori error estimate to a version of the Jacobi iteration that employs
relaxation. The relaxation parameter directly enters into the history contributions of the error
estimate.

This  suggests  a  generalization  of  an  adaptive  strategy  in  which  the  decay  of  the  history 
contributions  is  monitored,  and  the  iteration  is  interrupted  and  restarted  with  a  new  relaxation 
parameter  value  if  the  history  contributions  are  contributing  too  much  relative  to  the  other  error 
contributions.  The  efficiency  question  is  balancing  the  error  contributions  between  those  arising 
from  the  discretization  against  those  arising  from  the  iteration. 

Consider once again the problem in §4.2, with quantity of interest equal to the value of $u_2$ at
$x = 0.1$ (adjoint data $\psi_2 = \delta_{0.1}$). When $\beta = 10$ and A = 20 this iteration cannot converge without
relaxation, regardless of mesh density.

We present the adaptive algorithm in Alg. 4. We begin the adaptive procedure using a coarse
common mesh for both $U_1$ and $U_2$ with $h = 0.2$. We perform a Jacobi iteration with no relaxation
until  the  convergence  criterion  (||(7‘  —  (/‘~^||oc  £  10~^)  is  satisfied  or  a  maximum  of  8  iterations 
is reached. At this point, the (computable) iteration error estimate (2.11) is calculated. We then
compute the error representation formula (3.4) using a variable number of history terms. In general,
the minimum number of history terms is set to be $\min\{4, k\}$, but the computation of history terms
is halted if the $j$th history term is too small ($< C_1 h \mathcal{E}_{\mathrm{tot}}$) or too large ($> C_2 h^{-1} \mathcal{E}_{\mathrm{tot}}$),
where $C_1 = 0.1$ and $C_2 = 10$. We adapt the common mesh for $U_1$ and $U_2$ using a standard “mark
and refine” strategy applied to the a posteriori estimate using $\|\text{local error}\| <$ TOL/(# of elements)
as a marking criterion. If the iteration does not “converge” or if the iteration error estimate is greater
than the numerical error estimate, we increase the relaxation parameter $\alpha$ and repeat the process.
Various choices for updating $\alpha$ can be employed at this step. The process continues until both the
iteration  and  numerical  error  estimate  are  less  than  10~^. 
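The rule for deciding how many history terms to include can be sketched in a few lines. This is an interpretation under stated assumptions, not the authors' implementation: the running total of absolute contributions stands in for the total error estimate in the thresholds, a candidate term that fails either test is excluded, and the function name `count_history_terms` is hypothetical.

```python
def count_history_terms(terms, h, c1=0.1, c2=10.0, min_terms=4):
    """Number of history terms to keep from `terms` (ordered newest to oldest).

    At least min(min_terms, len(terms)) terms are always kept. After that, the
    next term is added only if it is neither negligible (< c1 * h * total)
    nor suspiciously large (> c2 * total / h) relative to the running total.
    """
    k = len(terms)
    m = min(min_terms, k)
    total = sum(abs(t) for t in terms[:m])
    while m < k:
        candidate = abs(terms[m])
        too_small = candidate < c1 * h * total
        too_large = candidate > c2 * total / h
        if too_small or too_large:
            break
        total += candidate
        m += 1
    return m


# Hypothetical contributions on a mesh with h = 0.2.
kept_decaying = count_history_terms([1.0, 0.5, 0.25, 0.125, 0.0625, 0.001, 0.0005], h=0.2)
kept_blowup = count_history_terms([1.0, 1.0, 1.0, 1.0, 1000.0], h=0.2)
kept_short = count_history_terms([1.0, 2.0], h=0.2)
```

In the first example the sixth term falls below the lower threshold and the count stops at five; in the second the fifth term exceeds the upper threshold, which in the adaptive algorithm would be a signal that the estimate is becoming unreliable.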

We show the final adaptive solution for $U_2$ in Figure 4.5. Generally, a quantity of interest of the
value at a point leads to a mesh that is highly refined in a local neighborhood of the point. But the
final refined mesh is nearly uniform, as we see. This is a consequence of the errors inherited from
previous iterations coupled to the fact that the solutions have significantly different scales, i.e., $U_1$ is
$O(1)$ while $U_2$ is $O(10^{-3})$. Thus, the localized contributions that should result in the refinement
of the mesh for $U_2$ near $x = 0.1$ are masked by the much larger, and nearly uniformly distributed,
contributions from $U_1$ in previous iterations.
Algorithm 4  Adaptive algorithm including relaxation adjustment

while total error > TOL do
    while iterations < k and it. error > TOL do
        Compute $U^{\{i+1\}}$
    end while
    Compute $\Phi^{(1)}, \ldots, \Phi^{(m)}$, $1 \le m \le i$
    while m < i and not STOP do
        if mth error term > $C_1 h$ (error terms) then
            if mth error term < $C_2 h^{-1}$ (error terms) then
                m = m + 1
                Compute $\Phi^{(m)}$
            else
                STOP = true
            end if
        else
            STOP = true
        end if
    end while
    Refine mesh using Non-Iteration error estimate
    if Iteration error > Non-Iteration error then
        update relaxation parameter $\alpha$
    end if
end while


Figure  4.5.  Final  adapted  solution  for  U2  for  Ex.  4.2.4  showing  the  spatial  grid. 


5.  CONCLUSIONS 

We have presented an a posteriori error analysis for operator decomposition iterative solution of
coupled semilinear elliptic systems that use a block iterative solution technique. The
analysis provides the means to compute accurate error estimates that account for discretization
errors in the solution of each component at a given iteration, the errors passed between components
at a given iteration, numerical errors inherited from previous iterations and errors arising due to
the iterative solution procedure. This paper specifically addresses the propagation of error between
iterates in the operator decomposition iteration solution and the effects of finite iteration on the error
estimate. We extend the adaptive discretization strategy in [1] to systematically reduce the error.

A-posteriori error estimates for
mixed finite element and finite volume methods
for problems coupled through a boundary
with non-matching grids

T. Arbogast, D. Estep, B. Sheehan and S. Tavener

[Received on 12 March 2013]

Using an adjoint based a-posteriori error estimate, we explore the accuracy of two different discretization
schemes applied to elliptic problems in which there are different meshes on two neighboring subdomains
that share an interface. The first discretization is a mixed finite element mortar method which relies on
interface variables to couple the subdomains, while the second discretization is a finite volume method
which relies on geometrically motivated projections to couple the subdomains. To facilitate comparison
of the accuracy of the two methods using the a-posteriori error estimate, the finite volume method is cast
as a mixed finite element method using appropriate quadrature. The a-posteriori error estimate is derived
and used to analyze both the size and source of the discretization error of both methods. We identify,
through numerical examples, cases in which the geometric projections are the dominant source of error
by one to two orders of magnitude. While this effect may be expected in examples where the solution
is changing rapidly near the interface, it is also demonstrated for an example in which the solution is
smooth, and nearly one dimensional across the interface.

Keywords:  mortar  methods,  a-posteriori  error  estimate,  coupled  elliptic  problems,  heterogeneous  domain 
decomposition,  geometric  coupling 


1.  Introduction 

An important class of multiphysics problems has a structure in which one physical process dominates
in one subdomain of the problem domain, while a second physical process dominates in a neighboring
subdomain. The solutions are coupled by continuity of state and continuity of normal flux through a
shared boundary between the subdomains. Examples include general problems of the heterogeneous
domain decomposition type (Quarteroni et al., 1992; Gaiffe et al., 2002; Bernardi et al., 1994),
core-edge plasma simulations of a tokamak fusion experiment (Cary et al., 2008, 2010), and conjugate heat
transfer between a fluid and solid object (Estep et al., 2008, 2009b, 2010).

In such situations, it is common to encounter significant differences in scales of behavior in the two
subdomains. This in turn suggests the use of different discretization grids. However, this introduces the
problem of interpreting the meaning of coupling state and flux values through the common boundary in
the discretization, since exact pointwise matching is no longer possible.

Confounding this issue are the practical difficulties of solving the large linear and nonlinear discrete
systems associated with computing numerical solutions and the common situation in which two different
codes are used to solve the two subdomain problems. These difficulties are generally tackled by
employing some form of iterative approach that involves sequential solution of the subdomain problems.
The particular properties of the discretizations used for each component problem, the choice of iterative
solution method, and high performance computational considerations all have a large impact on the way
in which state and flux values are passed across the common interface.

In this paper, we investigate the accuracy of two approaches to computing the coupling values in
the situation in which the discretization grids in the two subdomains do not match at the interface.
The analysis is carried out for the closely related mixed finite element and cell-centered finite volume
methods. The two approaches are (1) the mortar element approach (Brezzi & Fortin, 1991; Roberts
& Thomas, 1991; Arbogast et al., 2000; Ben Belgacem, 2000; Arbogast et al., 2007; Ganis & Yotov,
2009), which uses a rigorous variational formulation to define a weak sense of coupling, and (2) a
“geometric” approach that employs various ad hoc extrapolation and averaging methods. The use of mortar
elements is proven to be optimally convergent on nonmatching grids, provided the finite element space
used for the interface variables consists of piecewise polynomials of one degree higher than the trace
along the interface of the finite element space used to approximate the flux within the subdomains
(Arbogast et al., 2000). Nonetheless, while mortar elements are well known in some application domains,
e.g., flow in porous media, they are not widely employed for multiphysics problems. Rather, various
“geometric” techniques are used in most practical settings, especially in situations in which one or more
of the components are solved with legacy “black box” codes. This second approach is often rationalized
using a combination of ad hoc formal stability and/or accuracy arguments combined with high
performance computing expediences. Moreover, in the situation in which legacy codes are used to solve either
component, there is little choice because of the very considerable investment that would be required to
replace these codes.

We are not arguing for or against either mortar elements or “geometric” approaches. Rather, we
address two issues: (1) What effect do these coupling approaches have on accuracy of specified quantities
of interest? and (2) In each case, what are the relative contributions of various aspects of discretization
to the error in the computed information? The tool we use to address these issues is an adjoint-based
a-posteriori error estimate (Estep et al., 2000; Becker & Rannacher, 2001; Giles & Suli, 2002; Wheeler
& Yotov, 2005; Estep et al., 2009a; Hansbro & Larson, 2011). This goal-oriented estimate accurately
quantifies various contributions to the overall error. In particular, the estimate distinguishes
contributions specifically arising from the mis-matched grids and the way in which the coupled information is
approximated. We identify, through numerical examples, cases in which the geometric projections are
the dominant source of error by one to two orders of magnitude.

The remainder of this paper is organized as follows. Section two introduces the continuous problem
and the details of the two discrete methods. Section three derives the a-posteriori error estimate.
Section four contains the numerical experiments. Section five discusses computational logistics related to
iterative solvers, and a brief conclusion is given in section six.

2.  Definition of the problem and discretization methods

We  define  the  coupled  problem  with  a  common  interface,  then  describe  the  finite  element  and  finite 
volume  discretizations.  We  employ  the  well  known  equivalence  between  finite  volume  methods  and  the 
mixed  finite  element  method  (Russell  &  Wheeler,  1983;  Weiser  &  Wheeler,  1988)  to  recast  everything 
in  the  finite  element  framework.  This  greatly  eases  the  derivation  of  a-posteriori  error  estimates  and 
provides  a  systematic  framework  for  describing  geometric  approaches  to  computing  coupling  values. 

2.1  The continuous problem

The continuous problem (2.1)-(2.3) consists of a system of second order elliptic partial differential
equations (PDE) in two spatial dimensions. The system is posed on a rectangular domain $\Omega$ consisting
of two nonoverlapping rectangular subdomains, $\Omega_L$ on the left-hand side and $\Omega_R$ on the right-hand side,
that share a common interface $\Gamma_I$, and whose union forms the entire domain, as shown in Fig. 1. The
unit normal vector $n$ is defined to point from left to right on $\Gamma_I$, and is an outward pointing normal
on $\Gamma_L = \partial\Omega_L \setminus \Gamma_I$ and $\Gamma_R = \partial\Omega_R \setminus \Gamma_I$. For simplicity of presentation, we assume Dirichlet boundary
conditions on $\partial\Omega$, the external boundaries of the domain. The results extend to problems with Neumann
conditions on part of the boundary in a straightforward way.


Fig.  1.  Subdomains,  boundaries,  and  definition  of  normal  n  on  the  interface. 


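For a source function $f$, split as $f_L \in L^2(\Omega_L)$ and $f_R \in L^2(\Omega_R)$, and boundary data $g$, similarly split
as $g_L \in H^{1/2}(\Gamma_L)$ and $g_R \in H^{1/2}(\Gamma_R)$, the coupled system is

$$\begin{aligned}
a^{-1}\mathbf{u}_L + \nabla p_L &= 0, & (x,y) \in \Omega_L, \\
\nabla \cdot \mathbf{u}_L &= f_L, & (x,y) \in \Omega_L, \\
p_L &= g_L, & (x,y) \in \Gamma_L,
\end{aligned} \tag{2.1}$$

$$\begin{aligned}
a^{-1}\mathbf{u}_R + \nabla p_R &= 0, & (x,y) \in \Omega_R, \\
\nabla \cdot \mathbf{u}_R &= f_R, & (x,y) \in \Omega_R, \\
p_R &= g_R, & (x,y) \in \Gamma_R,
\end{aligned} \tag{2.2}$$

$$\begin{aligned}
\xi = p_L &= p_R, & (x,y) \in \Gamma_I, \\
n \cdot (\mathbf{u}_L - \mathbf{u}_R) &= 0, & (x,y) \in \Gamma_I,
\end{aligned} \tag{2.3}$$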
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where  we  assume  that  the  diffusion  matrix,  a,  is  a  function  of  space  times  the  identity. 


D{x,y)  0 
0  D(x,y) 


(2.4) 


with  D  e  W  .“(fi)  and  min_D(.v,y)  ^  Dq  >  0,  so  a  is  invertible  and  uniformly  coercive.  Note  that  we 
(.r,y)€fl 

have  defined  ^  as  the  common  interface  pressure  in  (2.3). 


2.2  Mixed  finite  element  mortar  discretization 

The mortar finite element discretization was developed precisely for the situation presented by
discretization of (2.1)-(2.3) using two different grids in the two different subdomains. We assume that
each subdomain is discretized by a (logically) rectangular finite element grid. Lagrange multipliers are
introduced on the interface boundary to provide a weak formulation of the pressure coupling conditions.
Since the grids are different on the two sides of the interface, the Lagrange multiplier space cannot be
the normal trace of the velocity space. So, we introduce a mortar finite element space on the
interface (Arbogast et al., 2000; Bernardi et al., 2005; Arbogast et al., 2007). As shown in Arbogast et al.
(2000), the method is optimally convergent and has several other desirable convergence properties if the
boundary space has one order higher approximability than the normal trace of the velocity space. The
same order of convergence is obtained for both continuous and discontinuous piecewise polynomials in
the mortar space. In our discretization, we choose an interface grid that has one cell for every two cells
in the finer of the two subdomain grids. Fig. 2 shows the arrangement for a 5 x 5 grid next to an 8 x 8 grid.
(Note that our convention is that the finer grid is always used in the right-hand subdomain.)


Fig. 2. Example grid shown separated into the part on $\Omega_L$, $\Gamma_I$, and $\Omega_R$ from left to right.


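We use standard $L^2$ inner product notation, i.e., for functions $F$ and $G$ defined on $\Omega$, split as above,

\[
(F_i, G_i) = \int_{\Omega_i} F_i(x,y)\, G_i(x,y)\, dx\, dy, \quad i = L, R,
\]

and for functions defined on the boundaries, we similarly denote

\[
\langle F, G \rangle_{\Gamma_i} = \int_{\Gamma_i} F\, G\, ds, \quad i = L, I, R.
\]

The mixed finite element (mortar) method starts with the following continuous weak formulation. Find
$p_i \in W_i = L^2(\Omega_i)$, $u_i \in V_i = H(\mathrm{div}, \Omega_i)$, $\xi \in \Lambda = H^{1/2}(\Gamma_I)$, $i = L, R$, satisfying

\[
\begin{aligned}
(a^{-1} u_L, v_L) - (p_L, \nabla \cdot v_L) + \langle \xi, n \cdot v_L \rangle_{\Gamma_I} &= -\langle g_L, n \cdot v_L \rangle_{\Gamma_L}, \\
(\nabla \cdot u_L, w_L) &= (f_L, w_L), \\
(a^{-1} u_R, v_R) - (p_R, \nabla \cdot v_R) - \langle \xi, n \cdot v_R \rangle_{\Gamma_I} &= -\langle g_R, n \cdot v_R \rangle_{\Gamma_R}, \\
(\nabla \cdot u_R, w_R) &= (f_R, w_R), \\
\langle n \cdot (u_L - u_R), \nu \rangle_{\Gamma_I} &= 0,
\end{aligned}
\tag{2.5}
\]

for all $(w_i, v_i, \nu) \in (W_i, V_i, \Lambda)$, $i = L, R$.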

To discretize, we use the lowest order Raviart-Thomas finite element space (RT0), in which the
discrete scalar unknown $p^h$ is approximated as a constant over each cell, and the components of the
discrete vector $u^h$ are approximated by functions that are piecewise linear in one spatial dimension and
constant in the other (Bernardi et al., 2005; Estep et al., 2009a). The discrete interface unknown, $\xi^h$, is
represented by piecewise discontinuous linears on the interface grid cells (Arbogast et al., 2000, 2007).
The test functions in the discretization of the weak formulation (2.5) corresponding to $w$, $v$, and $\nu$
are restricted to these same spaces. To be precise, for a finite element partition $\Delta$ of $[a,b]$, and for
$r = 0, 1, 2, \ldots$, $q = -1, 0, 1, \ldots$, we define the piecewise polynomial space

\[
\mathcal{M}^r_q(\Delta) = \{ v \in C^q([a,b]) : v \text{ is a polynomial of degree} \le r \text{ on each subinterval of } \Delta \}.
\]

When $q = -1$ the functions are discontinuous. The space of continuous piecewise bilinear functions is
the tensor product $\mathcal{M}^1_0(\Delta_x) \otimes \mathcal{M}^1_0(\Delta_y)$. The RT0 discrete spaces are

\[
\begin{aligned}
W_i^h &= \mathcal{M}^0_{-1}(\Delta_{x,i}) \otimes \mathcal{M}^0_{-1}(\Delta_{y,i}), \quad i = L, R, \\
V_i^h &= \left( \mathcal{M}^1_0(\Delta_{x,i}) \otimes \mathcal{M}^0_{-1}(\Delta_{y,i}) \right) \times \left( \mathcal{M}^0_{-1}(\Delta_{x,i}) \otimes \mathcal{M}^1_0(\Delta_{y,i}) \right), \quad i = L, R, \\
\Lambda^h &= \mathcal{M}^1_{-1}(\Delta_{\Gamma_I}).
\end{aligned}
\]
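As a rough illustration of the sizes of these spaces, the following sketch counts the unknowns for the 5 x 5 next to 8 x 8 arrangement of Fig. 2. The names `RT0Grid`, `mortar_dim`, and `total_unknowns` are hypothetical, and the requirement that the finer interface grid has an even number of cells is an assumption made here so that "one mortar cell for every two fine cells" is well defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RT0Grid:
    """Logically rectangular grid with nx by ny cells (hypothetical helper)."""
    nx: int
    ny: int

    @property
    def dim_pressure(self) -> int:
        # One constant per cell.
        return self.nx * self.ny

    @property
    def dim_velocity(self) -> int:
        # One normal-flux unknown per cell face (vertical faces + horizontal faces).
        return (self.nx + 1) * self.ny + self.nx * (self.ny + 1)


def mortar_dim(n_fine_interface_cells: int) -> int:
    """Discontinuous linears: two unknowns per mortar cell,
    with one mortar cell for every two cells of the finer grid."""
    if n_fine_interface_cells % 2 != 0:
        # Assumption of this sketch, not a statement from the text.
        raise ValueError("finer interface grid must have an even number of cells")
    return 2 * (n_fine_interface_cells // 2)


def total_unknowns(left: RT0Grid, right: RT0Grid, n_fine_interface_cells: int) -> int:
    return (left.dim_velocity + left.dim_pressure
            + right.dim_velocity + right.dim_pressure
            + mortar_dim(n_fine_interface_cells))


left, right = RT0Grid(5, 5), RT0Grid(8, 8)
print(left.dim_velocity, left.dim_pressure, right.dim_velocity, right.dim_pressure)
print(mortar_dim(8), total_unknowns(left, right, 8))
```

The velocity count includes faces on the outer boundary and on the interface, since with Dirichlet data the normal fluxes there remain unknowns.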

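The mixed finite element (mortar) method reads: Compute $p_i^h \in W_i^h$, $u_i^h \in V_i^h$, $\xi^h \in \Lambda^h$, $i = L, R$,
satisfying

\[
\begin{aligned}
(a^{-1} u_L^h, v_L) - (p_L^h, \nabla \cdot v_L) + \langle \xi^h, n \cdot v_L \rangle_{\Gamma_I} &= -\langle g_L, n \cdot v_L \rangle_{\Gamma_L}, \\
(\nabla \cdot u_L^h, w_L) &= (f_L, w_L), \\
(a^{-1} u_R^h, v_R) - (p_R^h, \nabla \cdot v_R) - \langle \xi^h, n \cdot v_R \rangle_{\Gamma_I} &= -\langle g_R, n \cdot v_R \rangle_{\Gamma_R}, \\
(\nabla \cdot u_R^h, w_R) &= (f_R, w_R), \\
\langle n \cdot (u_L^h - u_R^h), \nu \rangle_{\Gamma_I} &= 0,
\end{aligned}
\tag{2.6}
\]

for all $(w_i, v_i, \nu) \in (W_i^h, V_i^h, \Lambda^h)$, $i = L, R$. This yields a discrete system of the form

\[
\begin{pmatrix}
M_L & -B_L & 0 & 0 & C_L \\
B_L^T & 0 & 0 & 0 & 0 \\
0 & 0 & M_R & -B_R & C_R \\
0 & 0 & B_R^T & 0 & 0 \\
C_L^T & 0 & C_R^T & 0 & 0
\end{pmatrix}
\begin{pmatrix} u_L^h \\ p_L^h \\ u_R^h \\ p_R^h \\ \xi^h \end{pmatrix}
=
\begin{pmatrix} -D_L \\ F_L \\ -D_R \\ F_R \\ 0 \end{pmatrix},
\tag{2.7}
\]

where we abuse notation to let $u_i^h$, $p_i^h$, and $\xi^h$ denote the vectors of nodal values for the finite element
functions.

2.3 Geometrically coupled finite volume discretization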

The standard formulation of the finite volume method eschews a variational formulation of the problem,
so there is no natural description of a weak imposition of the coupling conditions in that formulation.
Moreover, the standard finite volume method provides approximate values of $p$ only at cell
centers, while approximate values for $u$ along cell boundaries are obtained by differencing the $p$ values.
These characteristics motivate the use of “geometric” coupling techniques that employ a combination
of extrapolation and averaging to provide coupling values of both unknowns along the interface. The
motivation for this approach is reinforced in the context of iterative solution of the coupled problems,
where well posed problems are created on each subdomain using interface boundary conditions
obtained from the other subdomain. In this approach, it is necessary to couple the coarser side using state
values extrapolated from the finer side solution, while the finer side must be coupled to flux values,
which are themselves differences of state values, extrapolated from the coarser solution. Reversing this
arrangement can lead to a singular system.


Fig. 3. Extrapolation to the interface. Left: Neumann values on the interface are computed by linear
extrapolation of the last two available flux values, which are differences of state values. Right: Dirichlet
values on the interface are computed by linear extrapolation of the last two available state values.

Fig. 4. Averaging or broadcasting of extrapolated values. Left: In the case of constant extrapolation, the
last available state or flux value is simply used as the interface value. Right: Weighted averaging of state
and flux values when cell widths do not share an integer ratio (locations A, B, C are marked by circles
and locations D, E by squares).


To obtain values on the interface, we employ either linear or constant extrapolation. We illustrate
linear extrapolation in Fig. 3. We compute the extrapolated values by computing a linear or constant
interpolant, which is then evaluated at the interface boundary. We denote the extrapolated values using
the operators $P_{R \to L}(p)$ and $P_{L \to R}(p)$. When the cells on either side of the interface do not match,
weighted averaging and “broadcasting” schemes are used to generate values. In Fig. 4, we illustrate the
averaging and broadcasting schemes when two cells on the right match one cell on the left. The state
values at the two circle locations are averaged and used at the square location. The flux value at the
square location is “broadcast” to both of the circle locations. When the cell widths on the coarse and
fine side of the interface do not share an integer ratio, then a suitable averaging of values is used. For
example, in the 2 cells next to 3 cells arrangement pictured in Fig. 4, the state value at location D is set
equal to 2/3 the state value at location A plus 1/3 the state value at location B. The flux value at location A
is set equal to the flux value at location D, while the flux value used at location B is set equal to half the
flux value at D plus half the flux value at E.
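The weights in the 2 cells next to 3 cells example are consistent with weighting by the length of overlap between cells along the interface. The sketch below is one way to compute such weights under that assumption; the function names are hypothetical, and the grids are taken to be uniform on $[0,1]$ purely for illustration.

```python
import numpy as np


def linear_extrapolate(x0, v0, x1, v1, x_interface):
    """Evaluate the line through (x0, v0) and (x1, v1) at x_interface."""
    slope = (v1 - v0) / (x1 - x0)
    return v1 + slope * (x_interface - x1)


def overlap_matrix(edges_a, edges_b):
    """O[i, j] = length of the overlap of cell i of grid a with cell j of grid b."""
    a = np.asarray(edges_a, dtype=float)
    b = np.asarray(edges_b, dtype=float)
    lo = np.maximum(a[:-1, None], b[None, :-1])
    hi = np.minimum(a[1:, None], b[None, 1:])
    return np.clip(hi - lo, 0.0, None)


def averaging_weights(edges_coarse, edges_fine):
    """Row i holds the weights applied to fine-side state values for coarse cell i."""
    overlap = overlap_matrix(edges_coarse, edges_fine)
    return overlap / overlap.sum(axis=1, keepdims=True)


def broadcast_weights(edges_coarse, edges_fine):
    """Row j holds the weights applied to coarse-side flux values for fine cell j."""
    overlap = overlap_matrix(edges_fine, edges_coarse)
    return overlap / overlap.sum(axis=1, keepdims=True)


coarse_edges = np.linspace(0.0, 1.0, 3)  # two coarse cells: D, E
fine_edges = np.linspace(0.0, 1.0, 4)    # three fine cells: A, B, C

print(averaging_weights(coarse_edges, fine_edges))  # row D: 2/3 A + 1/3 B
print(broadcast_weights(coarse_edges, fine_edges))  # row A: D; row B: D/2 + E/2
print(linear_extrapolate(0.25, 1.0, 0.75, 2.0, 1.0))  # two cell-center values to the interface
```

When the fine grid has exactly two cells per coarse cell, the same routine reduces to a plain average of two state values and a direct copy of each coarse flux value.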

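We formulate the finite volume method as an RT0 mixed finite element method employing a special
quadrature formula, following Russell & Wheeler (1983); Weiser & Wheeler (1988). This provides
a foundation for deriving an a-posteriori error analysis for the finite volume scheme, see Estep et al.
(2009a). The version of (2.6) equivalent to a finite volume method reads: Compute $p_i^h \in W_i^h$, $u_i^h \in V_i^h$,
$\xi^h \in \Lambda^h$, $i = L, R$, satisfying

\[
\begin{aligned}
(a^{-1} u_L^h, v_L)_{M,T} - (p_L^h, \nabla \cdot v_L) + \langle P_{R \to L}(p_R^h), n \cdot v_L \rangle_{\Gamma_I} &= -\langle g_L, n \cdot v_L \rangle_{\Gamma_L, M}, \\
(\nabla \cdot u_L^h, w_L) &= (f_L, w_L), \\
(a^{-1} u_R^h, v_R)_{M,T} - (p_R^h, \nabla \cdot v_R) - \langle \xi^h, n \cdot v_R \rangle_{\Gamma_I} &= -\langle g_R, n \cdot v_R \rangle_{\Gamma_R, M}, \\
(\nabla \cdot u_R^h, w_R) &= (f_R, w_R), \\
\langle P_{L \to R}(p_L^h) - n \cdot u_R^h, \nu \rangle_{\Gamma_I} &= 0,
\end{aligned}
\tag{2.8}
\]

for all $(w_i, v_i, \nu) \in (W_i^h, V_i^h, \Lambda^h)$, $i = L, R$. Here we employ the approximate inner product

\[
(a^{-1} u, v)_{M,T} = (a^{-1} u_x, v_x)_{T_x \times M_y} + (a^{-1} u_y, v_y)_{M_x \times T_y},
\]

where $M_{x,y}$ and $T_{x,y}$ denote the midpoint and trapezoidal quadrature rules in the $x$ and $y$ directions as
indicated, while $\langle \cdot, \cdot \rangle_{\Gamma_i, M}$ denotes the midpoint rule for $i = L, R$.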

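This yields a discrete system of the form

\[
\begin{pmatrix}
M_L & -B_L & 0 & Q_D & 0 \\
B_L^T & 0 & 0 & 0 & 0 \\
0 & 0 & M_R & -B_R & C_R \\
0 & 0 & B_R^T & 0 & 0 \\
0 & Q_N & C_R^T & 0 & 0
\end{pmatrix}
\begin{pmatrix} u_L^h \\ p_L^h \\ u_R^h \\ p_R^h \\ \xi^h \end{pmatrix}
=
\begin{pmatrix} -D_L \\ F_L \\ -D_R \\ F_R \\ 0 \end{pmatrix},
\tag{2.9}
\]

which should be compared to (2.7).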

It is possible to eliminate the unknowns $u_i^h$, $i = L, R$, and to reduce (2.9) to a system for $p_i^h$ of the
form

\[
\begin{pmatrix} A_L & C_D \\ C_N & A_R \end{pmatrix}
\begin{pmatrix} p_L^h \\ p_R^h \end{pmatrix}
=
\begin{pmatrix} F_L \\ F_R \end{pmatrix}.
\tag{2.10}
\]

The averaging and broadcasting are incorporated into the “coupling Dirichlet” and “coupling Neumann”
matrices $C_D$ and $C_N$. This is the same system that is constructed by using a finite volume approach
directly.

We have verified through numerical experiments that the $p$ component of the solution of (2.9) is
identical to the solution of (2.10). Furthermore, the $u$ component of the solution of (2.9) is identical to
the $u$ values obtained by differencing the solution of (2.10) to approximate $\nabla p$ at the cell boundaries
and evaluating the diffusivity at the cell boundaries. The $\xi$ component of the solution of (2.9) has no
counterpart in the solution of (2.10).
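The mechanism behind this equivalence can be seen in a one-dimensional analogue. The sketch below is an illustration under simplifying assumptions (a single domain, $a = 1$, a uniform grid, Dirichlet ends), not the two-dimensional coupled system: the trapezoidal rule makes the RT0 mass matrix diagonal, so the flux unknowns can be eliminated cell by cell, and the resulting matrix for the cell pressures is the usual cell-centered finite volume matrix. All function names are hypothetical.

```python
import numpy as np


def rt0_blocks_1d(n: int, length: float = 1.0):
    """Trapezoid-lumped flux mass matrix M and divergence coupling B in 1D."""
    h = length / n
    m_diag = np.full(n + 1, h)
    m_diag[[0, -1]] = h / 2.0          # end nodes carry half a cell of trapezoid weight
    M = np.diag(m_diag)
    B = np.zeros((n + 1, n))           # B[i, j] = integral over cell j of (hat_i)'
    for j in range(n):
        B[j, j] = -1.0                 # hat at the left node decreases across cell j
        B[j + 1, j] = 1.0              # hat at the right node increases across cell j
    return M, B, h


def schur_pressure_matrix(M, B):
    """Eliminate the fluxes from  M u - B p = -D,  B^T u = F."""
    return B.T @ np.linalg.solve(M, B)


def finite_volume_matrix_1d(n: int, length: float = 1.0):
    """Cell-centered differences; boundary cells use the half-cell distance."""
    h = length / n
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
    A[0, 0] = A[-1, -1] = 3.0 / h
    return A


M, B, h = rt0_blocks_1d(6)
print(np.allclose(schur_pressure_matrix(M, B), finite_volume_matrix_1d(6)))
```

With the exact (consistent) mass matrix instead of the lumped one, $M^{-1}$ would be dense and the reduced pressure matrix would no longer have the compact finite volume stencil.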


3.  A-posteriori  error  analysis 

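Our goal is to derive an a-posteriori error estimate for the quantity of interest

\[
(e_{u_L}, \psi_{u_L}) + (e_{p_L}, \psi_{p_L}) + (e_{u_R}, \psi_{u_R}) + (e_{p_R}, \psi_{p_R}) + \langle e_\xi, \psi_\xi \rangle_{\Gamma_I},
\tag{3.1}
\]

where $\psi_{u_L}$, $\psi_{p_L}$, $\psi_{u_R}$, $\psi_{p_R}$, and $\psi_\xi$ are given $L^2$ functions and $e_{(\cdot)}$ denotes the errors in the corresponding
variables. We define the generalized Green’s function corresponding to these functionals using the
adjoint problem

\[
\begin{aligned}
a^{-1} \phi_L - \nabla \zeta_L &= \psi_{u_L} & &\text{on } \Omega_L, \\
-\nabla \cdot \phi_L &= \psi_{p_L} & &\text{on } \Omega_L, \\
\zeta_L &= 0 & &\text{on } \Gamma_L,
\end{aligned}
\tag{3.2}
\]

\[
\begin{aligned}
a^{-1} \phi_R - \nabla \zeta_R &= \psi_{u_R} & &\text{on } \Omega_R, \\
-\nabla \cdot \phi_R &= \psi_{p_R} & &\text{on } \Omega_R, \\
\zeta_R &= 0 & &\text{on } \Gamma_R,
\end{aligned}
\tag{3.3}
\]

\[
\begin{aligned}
\beta = \zeta_L &= \zeta_R & &\text{on } \Gamma_I, \\
n \cdot (\phi_L - \phi_R) &= \psi_\xi & &\text{on } \Gamma_I.
\end{aligned}
\tag{3.4}
\]

The a-posteriori error estimates explicitly depend on $\phi_L$, $\zeta_L$, $\phi_R$, $\zeta_R$, and $\beta$.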


3.1 Estimate for mortar mixed finite element method

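We first derive an estimate for the mortar finite element method assuming all integrals in the weak
formulation are computed exactly. We begin by substituting (3.2)-(3.4) for the various $\psi$’s in (3.1) and
applying the divergence theorem,

\[
\begin{aligned}
&(e_{u_L}, \psi_{u_L}) + (e_{p_L}, \psi_{p_L}) + (e_{u_R}, \psi_{u_R}) + (e_{p_R}, \psi_{p_R}) + \langle e_\xi, \psi_\xi \rangle_{\Gamma_I} \\
&\quad = (e_{u_L}, a^{-1} \phi_L) + (\nabla \cdot e_{u_L}, \zeta_L) - \langle n \cdot e_{u_L}, \beta \rangle_{\Gamma_I} - (e_{p_L}, \nabla \cdot \phi_L) \\
&\qquad + (e_{u_R}, a^{-1} \phi_R) + (\nabla \cdot e_{u_R}, \zeta_R) + \langle n \cdot e_{u_R}, \beta \rangle_{\Gamma_I} - (e_{p_R}, \nabla \cdot \phi_R) \\
&\qquad + \langle e_\xi, n \cdot (\phi_L - \phi_R) \rangle_{\Gamma_I}.
\end{aligned}
\tag{3.5}
\]

Expanding the errors on the right and subtracting

\[
\begin{aligned}
&(a^{-1} u_L, \phi_L) - (p_L, \nabla \cdot \phi_L) + \langle \xi, n \cdot \phi_L \rangle_{\Gamma_I} + \langle g_L, n \cdot \phi_L \rangle_{\Gamma_L} \\
&\quad + (\nabla \cdot u_L, \zeta_L) - (f_L, \zeta_L) \\
&\quad + (a^{-1} u_R, \phi_R) - (p_R, \nabla \cdot \phi_R) - \langle \xi, n \cdot \phi_R \rangle_{\Gamma_I} + \langle g_R, n \cdot \phi_R \rangle_{\Gamma_R} \\
&\quad + (\nabla \cdot u_R, \zeta_R) - (f_R, \zeta_R) \\
&\quad - \langle n \cdot (u_L - u_R), \beta \rangle_{\Gamma_I} = 0,
\end{aligned}
\]

obtained by substituting the adjoint solution as test functions into the forward weak form (2.5), gives

\[
\begin{aligned}
&(e_{u_L}, \psi_{u_L}) + (e_{p_L}, \psi_{p_L}) + (e_{u_R}, \psi_{u_R}) + (e_{p_R}, \psi_{p_R}) + \langle e_\xi, \psi_\xi \rangle_{\Gamma_I} \\
&\quad = -(a^{-1} u_L^h, \phi_L) + (p_L^h, \nabla \cdot \phi_L) - \langle g_L, n \cdot \phi_L \rangle_{\Gamma_L} - \langle \xi^h, n \cdot \phi_L \rangle_{\Gamma_I} \\
&\qquad + (f_L, \zeta_L) - (\nabla \cdot u_L^h, \zeta_L) \\
&\qquad - (a^{-1} u_R^h, \phi_R) + (p_R^h, \nabla \cdot \phi_R) - \langle g_R, n \cdot \phi_R \rangle_{\Gamma_R} + \langle \xi^h, n \cdot \phi_R \rangle_{\Gamma_I} \\
&\qquad + (f_R, \zeta_R) - (\nabla \cdot u_R^h, \zeta_R) \\
&\qquad + \langle n \cdot (u_L^h - u_R^h), \beta \rangle_{\Gamma_I}.
\end{aligned}
\tag{3.6}
\]

We rewrite this as

\[
\begin{aligned}
&(e_{u_L}, \psi_{u_L}) + (e_{p_L}, \psi_{p_L}) + (e_{u_R}, \psi_{u_R}) + (e_{p_R}, \psi_{p_R}) + \langle e_\xi, \psi_\xi \rangle_{\Gamma_I} \\
&\quad = (R_{u_L}, \phi_L) + (R_{p_L}, \zeta_L) + (R_{u_R}, \phi_R) + (R_{p_R}, \zeta_R) + \langle R_\xi, \beta \rangle_{\Gamma_I},
\end{aligned}
\tag{3.7}
\]

wherein the residuals are given by

\[
\begin{aligned}
R_{u_L} &= -a^{-1} u_L^h - \nabla p_L^h, & R_{u_R} &= -a^{-1} u_R^h - \nabla p_R^h, \\
R_{p_L} &= f_L - \nabla \cdot u_L^h, & R_{p_R} &= f_R - \nabla \cdot u_R^h, & R_\xi &= n \cdot (u_L^h - u_R^h).
\end{aligned}
\]

Note that the divergence theorem implies

\[
\begin{aligned}
(R_{u_L}, \phi_L) &= -(a^{-1} u_L^h, \phi_L) + (p_L^h, \nabla \cdot \phi_L) - \langle p_L^h, n \cdot \phi_L \rangle_{\partial\Omega_L} \\
&= -(a^{-1} u_L^h, \phi_L) + (p_L^h, \nabla \cdot \phi_L) - \langle g_L, n \cdot \phi_L \rangle_{\Gamma_L} - \langle \xi^h, n \cdot \phi_L \rangle_{\Gamma_I}.
\end{aligned}
\]

Also note that $\beta = \zeta_L = \zeta_R$ on $\Gamma_I$ for the continuous adjoint solution, but $\beta^h$ is distinct from $\zeta_L^h$ and $\zeta_R^h$ for the
discrete solution.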

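Next, we use Galerkin orthogonality. We introduce projection operators that map into the finite
element space of the discrete forward solution:

\[
\Pi_i^h : L^2(\Omega_i)^2 \to V_i^h, \quad P_i^h : L^2(\Omega_i) \to W_i^h, \quad i = L, R, \qquad Z^h : L^2(\Gamma_I) \to \Lambda^h.
\]

The actual choice of projection is immaterial for the estimate. In practice, we employ a combination of
restriction and averaging. Without quadrature, Galerkin orthogonality for (2.6) is expressed as

\[
(R_{u_L}, \Pi_L^h \phi_L) + (R_{p_L}, P_L^h \zeta_L) + (R_{u_R}, \Pi_R^h \phi_R) + (R_{p_R}, P_R^h \zeta_R) + \langle R_\xi, Z^h \beta \rangle_{\Gamma_I} = 0,
\]

and subtracting gives the following result.

Theorem 3.1 The errors for the mixed finite element method (2.6) without quadrature satisfy

\[
\begin{aligned}
&(e_{p_L}, \psi_{p_L}) + (e_{u_L}, \psi_{u_L}) + (e_{p_R}, \psi_{p_R}) + (e_{u_R}, \psi_{u_R}) + \langle e_\xi, \psi_\xi \rangle_{\Gamma_I} \\
&\quad = (R_{u_L}, \phi_L - \Pi_L^h \phi_L) + (R_{p_L}, \zeta_L - P_L^h \zeta_L) \\
&\qquad + (R_{u_R}, \phi_R - \Pi_R^h \phi_R) + (R_{p_R}, \zeta_R - P_R^h \zeta_R) + \langle R_\xi, \beta - Z^h \beta \rangle_{\Gamma_I},
\end{aligned}
\tag{3.8}
\]

wherein the quantities on the right-hand side are computable provided the true adjoint solution is
available.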
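The structure of this result can be checked in a finite-dimensional analogue. The sketch below is a plain linear algebra illustration with randomly generated data, not the mixed method itself: for $Au = b$, a Galerkin approximation $u_h$ in a subspace, and the adjoint solution of $A^T \phi = \psi$, the error in the quantity of interest $\psi^T u$ equals the residual weighted by $\phi - \Pi\phi$, because the residual is orthogonal to the subspace. The function name `adjoint_identity` and all data are hypothetical.

```python
import numpy as np


def adjoint_identity(seed: int = 0, n: int = 12, k: int = 4):
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((n, n))
    A = G @ G.T + n * np.eye(n)                        # symmetric positive definite operator
    b = rng.standard_normal(n)                         # forward data
    psi = rng.standard_normal(n)                       # defines the quantity of interest
    V = np.linalg.qr(rng.standard_normal((n, k)))[0]   # orthonormal basis of the subspace

    u = np.linalg.solve(A, b)                          # "true" solution
    u_h = V @ np.linalg.solve(V.T @ A @ V, V.T @ b)    # Galerkin approximation
    phi = np.linalg.solve(A.T, psi)                    # adjoint solution
    proj_phi = V @ (V.T @ phi)                         # projection into the subspace

    residual = b - A @ u_h
    qoi_error = psi @ (u - u_h)                        # error in the quantity of interest
    estimate = residual @ (phi - proj_phi)             # adjoint-weighted residual
    orthogonality = residual @ proj_phi                # vanishes by Galerkin orthogonality
    return qoi_error, estimate, orthogonality


qoi_error, estimate, orthogonality = adjoint_identity()
print(qoi_error, estimate, orthogonality)
```

Subtracting the projection does not change the value of the weighted residual here; its role is to localize the estimate, since $\phi - \Pi\phi$ is small wherever the adjoint solution is well resolved by the forward space.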

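In practice, we employ a numerical solution of the adjoint problem. To emphasize this, we state the
following corollary that involves numerical adjoint quantities.

Corollary 3.1 Provided that the projection operators $\Pi_L^h$, $\Pi_R^h$, $P_L^h$, $P_R^h$, and $Z^h$ are bounded in $L^2$, the
errors for the mixed finite element method (2.6) without quadrature can be estimated as

\[
\begin{aligned}
&(e_{p_L}, \psi_{p_L}) + (e_{u_L}, \psi_{u_L}) + (e_{p_R}, \psi_{p_R}) + (e_{u_R}, \psi_{u_R}) + \langle e_\xi, \psi_\xi \rangle_{\Gamma_I} \\
&\quad \approx (R_{u_L}, \phi_L^h - \Pi_L^h \phi_L^h) + (R_{p_L}, \zeta_L^h - P_L^h \zeta_L^h) \\
&\qquad + (R_{u_R}, \phi_R^h - \Pi_R^h \phi_R^h) + (R_{p_R}, \zeta_R^h - P_R^h \zeta_R^h) + \langle R_\xi, \beta^h - Z^h \beta^h \rangle_{\Gamma_I},
\end{aligned}
\tag{3.9}
\]

for numerical solutions $\phi_L \approx \phi_L^h$, $\zeta_L \approx \zeta_L^h$, $\phi_R \approx \phi_R^h$, $\zeta_R \approx \zeta_R^h$, and $\beta \approx \beta^h$. In this approximation, the
errors are to be measured in the $L^2$-norm.

The proof follows from the triangle inequality and the definition of the operator norm. That is, the
absolute value of the difference between the right-hand sides of (3.8) and (3.9) is bounded by

\[
\begin{aligned}
&(1 + \|\Pi_L^h\|)\, \|R_{u_L}\|_2\, \|\phi_L - \phi_L^h\|_2 + (1 + \|P_L^h\|)\, \|R_{p_L}\|_2\, \|\zeta_L - \zeta_L^h\|_2 \\
&\quad + (1 + \|\Pi_R^h\|)\, \|R_{u_R}\|_2\, \|\phi_R - \phi_R^h\|_2 + (1 + \|P_R^h\|)\, \|R_{p_R}\|_2\, \|\zeta_R - \zeta_R^h\|_2 \\
&\quad + (1 + \|Z^h\|)\, \|R_\xi\|_{2,\Gamma_I}\, \|\beta - \beta^h\|_{2,\Gamma_I}.
\end{aligned}
\]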

In  order  to  obtain  accurate  estimates,  the  numerical  adjoint  solutions  must  be  sufficiently  accurate. 
Generally  this  is  satisfied  by  solving  the  adjoint  problems  either  using  a  higher  order  numerical  method 
or  using  a  mesh  sufficiently  refined  from  the  one  used  for  the  forward  discretization.  In  the  context  of 
finite  volume  discretizations,  the  second  approach  is  generally  easier  to  implement.  In  our  numerical 
examples  we  use  a  finer  grid,  and  the  accuracy  of  this  approach  is  illustrated  in  section  4.1. 

3.2  Estimate  for  finite  volume  methods  using  geometric  coupling 

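3.2.1 The effect of quadrature. We first derive an estimate for the mixed finite element method (2.6)
with quadrature, which can be applied, say, if $f$, $g$, and $a$ are continuous. With quadrature, Galerkin
orthogonality is expressed as

\[
(R_{u_L}, \Pi_L^h \phi_L)_Q + (R_{p_L}, P_L^h \zeta_L)_Q + (R_{u_R}, \Pi_R^h \phi_R)_Q + (R_{p_R}, P_R^h \zeta_R)_Q + \langle R_\xi, Z^h \beta \rangle_{Q,\Gamma_I} = 0,
\]

where we use the subscript $Q$ to denote the approximate inner product using quadrature. It is
important to distinguish residuals associated with approximating the solution spaces using finite dimensional
polynomial spaces from residuals associated with approximating the integrals defining the variational
formulation. We rewrite Galerkin orthogonality as

\[
\begin{aligned}
&(R_{u_L}, \Pi_L^h \phi_L) + (R_{p_L}, P_L^h \zeta_L) + (R_{u_R}, \Pi_R^h \phi_R) + (R_{p_R}, P_R^h \zeta_R) + \langle R_\xi, Z^h \beta \rangle_{\Gamma_I} \\
&\quad - QE_{u_L}(\Pi_L^h \phi_L) - QE_{p_L}(P_L^h \zeta_L) - QE_{u_R}(\Pi_R^h \phi_R) - QE_{p_R}(P_R^h \zeta_R) - QE_\xi(Z^h \beta) = 0,
\end{aligned}
\]

with

\[
\begin{aligned}
QE_{u_L}(\Pi_L^h \phi_L) &= (R_{u_L}, \Pi_L^h \phi_L) - (R_{u_L}, \Pi_L^h \phi_L)_Q, \\
QE_{p_L}(P_L^h \zeta_L) &= (R_{p_L}, P_L^h \zeta_L) - (R_{p_L}, P_L^h \zeta_L)_Q, \\
QE_{u_R}(\Pi_R^h \phi_R) &= (R_{u_R}, \Pi_R^h \phi_R) - (R_{u_R}, \Pi_R^h \phi_R)_Q, \\
QE_{p_R}(P_R^h \zeta_R) &= (R_{p_R}, P_R^h \zeta_R) - (R_{p_R}, P_R^h \zeta_R)_Q, \\
QE_\xi(Z^h \beta) &= \langle R_\xi, Z^h \beta \rangle_{\Gamma_I} - \langle R_\xi, Z^h \beta \rangle_{Q,\Gamma_I}.
\end{aligned}
\]

This gives the following a-posteriori estimate for the mixed finite element method with quadrature.

Theorem 3.2 If $f$, $g$, and $a$ are continuous, then the errors for the mixed finite element method (2.6)
with quadrature satisfy

\[
\begin{aligned}
&(e_{p_L}, \psi_{p_L}) + (e_{u_L}, \psi_{u_L}) + (e_{p_R}, \psi_{p_R}) + (e_{u_R}, \psi_{u_R}) + \langle e_\xi, \psi_\xi \rangle_{\Gamma_I} \\
&\quad = (R_{u_L}, \phi_L - \Pi_L^h \phi_L) + (R_{p_L}, \zeta_L - P_L^h \zeta_L) \\
&\qquad + (R_{u_R}, \phi_R - \Pi_R^h \phi_R) + (R_{p_R}, \zeta_R - P_R^h \zeta_R) + \langle R_\xi, \beta - Z^h \beta \rangle_{\Gamma_I} \\
&\qquad + QE_{u_L}(\Pi_L^h \phi_L) + QE_{p_L}(P_L^h \zeta_L) + QE_{u_R}(\Pi_R^h \phi_R) + QE_{p_R}(P_R^h \zeta_R) + QE_\xi(Z^h \beta).
\end{aligned}
\tag{3.10}
\]

Note that in the case of using the RT0 finite element space and the midpoint-trapezoidal quadrature
rules discussed above, the mixed finite element method reduces to the finite volume method (Russell &
Wheeler, 1983; Weiser & Wheeler, 1988; Estep et al., 2009a), and some of the quadrature error terms
are zero. These terms are included for generality, so that (3.10) is valid for other combinations of finite
element spaces and quadratures.

Note that in practice, we implement the obvious analog of Corollary 3.1, which now requires
sufficient smoothness of the solution to obtain sufficiently accurate quadrature approximations.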
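The way a quadrature term enters the error representation can also be seen in a finite-dimensional analogue. In the sketch below, which uses hypothetical names and randomly generated data, the discrete problem is solved with a perturbed operator and right-hand side standing in for inexact quadrature. Galerkin orthogonality then holds only for the perturbed residual, and the error in the quantity of interest splits into an adjoint-weighted residual term plus a term measuring the difference between the exact and perturbed forms.

```python
import numpy as np


def quadrature_split(seed: int = 1, n: int = 12, k: int = 4, eps: float = 1e-2):
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((n, n))
    A = G @ G.T + n * np.eye(n)                  # "exact" operator
    S = rng.standard_normal((n, n))
    A_q = A + eps * (S + S.T)                    # operator as seen through inexact quadrature
    b = rng.standard_normal(n)
    b_q = b + eps * rng.standard_normal(n)       # perturbed right-hand side
    psi = rng.standard_normal(n)
    V = np.linalg.qr(rng.standard_normal((n, k)))[0]

    u = np.linalg.solve(A, b)
    u_h = V @ np.linalg.solve(V.T @ A_q @ V, V.T @ b_q)   # discrete solve uses perturbed forms
    phi = np.linalg.solve(A.T, psi)
    proj_phi = V @ (V.T @ phi)

    residual = b - A @ u_h                       # residual in the exact inner product
    residual_q = b_q - A_q @ u_h                 # residual as evaluated with quadrature
    qoi_error = psi @ (u - u_h)
    residual_term = residual @ (phi - proj_phi)
    quadrature_term = residual @ proj_phi - residual_q @ proj_phi
    orthogonality_q = residual_q @ proj_phi      # vanishes for the perturbed forms
    return qoi_error, residual_term, quadrature_term, orthogonality_q


qoi_error, residual_term, quadrature_term, orthogonality_q = quadrature_split()
print(qoi_error, residual_term + quadrature_term, orthogonality_q)
```

Setting `eps = 0` removes the quadrature term and recovers the structure of the estimate without quadrature.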

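3.2.2 The effect of geometric coupling. For the geometric coupling (2.8), the Galerkin orthogonality
becomes

\[
\begin{aligned}
&(R_{u_L}, \Pi_L^h \phi_L)_Q - \langle P_{R \to L}(p_R^h) - \xi^h, n \cdot \Pi_L^h \phi_L \rangle_{Q,\Gamma_I} + (R_{p_L}, P_L^h \zeta_L)_Q \\
&\quad + (R_{u_R}, \Pi_R^h \phi_R)_Q + (R_{p_R}, P_R^h \zeta_R)_Q + \langle R_\xi, Z^h \beta \rangle_{Q,\Gamma_I} - \langle n \cdot u_L^h - P_{L \to R}(p_L^h), Z^h \beta \rangle_{Q,\Gamma_I} = 0.
\end{aligned}
\]

Defining

\[
\begin{aligned}
QE_{u_L}(\Pi_L^h \phi_L) &= (R_{u_L}, \Pi_L^h \phi_L) - (R_{u_L}, \Pi_L^h \phi_L)_Q \\
&\quad + \langle P_{R \to L}(p_R^h) - \xi^h, n \cdot \Pi_L^h \phi_L \rangle_{Q,\Gamma_I} - \langle P_{R \to L}(p_R^h) - \xi^h, n \cdot \Pi_L^h \phi_L \rangle_{\Gamma_I}, \\
QE_\xi(Z^h \beta) &= \langle R_\xi, Z^h \beta \rangle_{\Gamma_I} - \langle R_\xi, Z^h \beta \rangle_{Q,\Gamma_I} \\
&\quad + \langle n \cdot u_L^h - P_{L \to R}(p_L^h), Z^h \beta \rangle_{Q,\Gamma_I} - \langle n \cdot u_L^h - P_{L \to R}(p_L^h), Z^h \beta \rangle_{\Gamma_I},
\end{aligned}
\]

and arguing as above gives the following result.

Theorem 3.3 If $f$, $g$, and $a$ are continuous, then the error for the mixed geometric finite volume method
(2.8) satisfies

\[
\begin{aligned}
&(e_{p_L}, \psi_{p_L}) + (e_{u_L}, \psi_{u_L}) + (e_{p_R}, \psi_{p_R}) + (e_{u_R}, \psi_{u_R}) + \langle e_\xi, \psi_\xi \rangle_{\Gamma_I} \\
&\quad = (R_{u_L}, \phi_L - \Pi_L^h \phi_L) + \langle P_{R \to L}(p_R^h) - \xi^h, n \cdot \Pi_L^h \phi_L \rangle_{\Gamma_I} + (R_{p_L}, \zeta_L - P_L^h \zeta_L) \\
&\qquad + (R_{u_R}, \phi_R - \Pi_R^h \phi_R) + (R_{p_R}, \zeta_R - P_R^h \zeta_R) \\
&\qquad + \langle R_\xi, \beta - Z^h \beta \rangle_{\Gamma_I} + \langle n \cdot u_L^h - P_{L \to R}(p_L^h), Z^h \beta \rangle_{\Gamma_I} \\
&\qquad + QE_{u_L}(\Pi_L^h \phi_L) + QE_{p_L}(P_L^h \zeta_L) + QE_{u_R}(\Pi_R^h \phi_R) + QE_{p_R}(P_R^h \zeta_R) + QE_\xi(Z^h \beta).
\end{aligned}
\tag{3.11}
\]

Note that in practice, we implement the obvious analog of Corollary 3.1, assuming again sufficient
smoothness of the solution to obtain sufficiently accurate quadrature approximations.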

4.  Numerical  investigations 

In this section, we use the a-posteriori error estimates to investigate in detail the accuracy of the two
approaches to coupling. For all of the investigations, the coarser subdomain $\Omega_L$ is given by $x \in [-1, 1]$
and $y \in [-2, 0]$, the finer subdomain $\Omega_R$ is given by $x \in [-1, 1]$ and $y \in [0, 2]$ (see Fig. 1), and the
interface $\Gamma_I$ is located along $y = 0$. (Note that here the bottom subdomain is considered as being “left”
and the top one is “right,” in conformance to our convention as to the finer subdomain.) The grids are
reported as $n_L \times m_L$ for the left domain and $n_R \times m_R$ for the right domain, where $n_i$ corresponds to
the number of cells in the x-direction (which is also the number of cells along the interface), and $m_i$
corresponds to the number of cells in the y-direction. The boundary conditions for all tests are Dirichlet.
To avoid issues arising from iterative solution of the discrete system, we employ direct methods to find
the approximate solution to within machine precision.

The quantity of interest being sought is specified by giving the adjoint problem data $\psi_{u_L}$, $\psi_{p_L}$, $\psi_{u_R}$, $\psi_{p_R}$,
and $\psi_\xi$. The adjoint problem is solved using the same RT0 mixed finite element method, but on a grid
that is significantly finer than that of the forward problem, so that the discretization error associated with
the adjoint solution has no significant effect on the results.

The  functions  chosen  for  the  source,  diffusivity,  and  adjoint  data  are  either  constants  or  Gaussian 
functions  of  the  form 


v/2? 


which gives a localized “ridge” centered at $y = b$. In the case of the adjoint data, the Gaussian or constant
function being used is normalized so that the area under $\psi$ is equal to one. The parameter $K$ is nonzero
only in the case of diffusivity, where this constant is added to the Gaussian to prevent the diffusivity
from approaching zero anywhere in the domain.

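In the tests, we report values for the terms in (3.10) and (3.11) that are nonzero. For both the mixed
finite element and geometric finite volume methods the following five terms are included:

\[
\begin{aligned}
MFE_1 \text{ or } GFV_1 &= (R_{u_L}, \phi_L^h - \Pi_L^h \phi_L^h), \\
MFE_2 \text{ or } GFV_2 &= (R_{u_R}, \phi_R^h - \Pi_R^h \phi_R^h), \\
MFE_3 \text{ or } GFV_3 &= (R_{p_L}, \zeta_L^h - P_L^h \zeta_L^h), \\
MFE_4 \text{ or } GFV_4 &= (R_{p_R}, \zeta_R^h - P_R^h \zeta_R^h), \\
MFE_5 \text{ or } GFV_5 &= \langle R_\xi, \beta^h - Z^h \beta^h \rangle_{\Gamma_I}.
\end{aligned}
\]

In the geometric finite volume case, we add two additional terms relating to the geometric projections
and two additional quadrature terms:

\[
\begin{aligned}
GFV_6 &= \langle P_{R \to L}(p_R^h) - \xi^h, n \cdot \Pi_L^h \phi_L^h \rangle_{\Gamma_I}, \\
GFV_7 &= \langle n \cdot u_L^h - P_{L \to R}(p_L^h), Z^h \beta^h \rangle_{\Gamma_I}, \\
GFV_8 &= QE_{u_L}(\Pi_L^h \phi_L^h), \\
GFV_9 &= QE_{u_R}(\Pi_R^h \phi_R^h).
\end{aligned}
\]

We note that the first five expressions, common to both MFE and GFV, are often similar in size. As a
gross measure of the effect of geometric projection and of the use of quadrature, we also report the
two ratios

\[
\mathrm{ratio}_{\mathrm{proj}} = \frac{\sum_{i=6}^{7} |GFV_i|}{\sum_{i=1}^{5} |GFV_i|},
\qquad
\mathrm{ratio}_{\mathrm{quad}} = \frac{\sum_{i=8}^{9} |GFV_i|}{\sum_{i=1}^{5} |GFV_i|}.
\]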

4.1 Verification of a-posteriori estimate accuracy

We  begin  with  a  problem  for  which  we  have  manufactured  the  known  solution 

p(x,y)=cos^y)cos(^^. 


(4.1)

The diffusivity $a$ is equal to one everywhere. The other solution components, the source term $f$, and the
boundary values $g$ for the problem follow from (4.1). Since we know the true solution, we can compute
the exact error terms $(e, \psi)$ on the left in (3.10) and (3.11) directly and then compare to estimates of
the quantities on the right computed using a numerical solution to the adjoint problem. In this situation,
the most important issue for the accuracy of the estimates is the accuracy of the approximate adjoint
solutions. As the grid for the adjoint problem is refined, the estimates become more accurate. That is,
using the approximation to the adjoint problem, the estimated quantities $\sum_i MFE_i$ or $\sum_i GFV_i$ become
closer to their true values, the error in the quantity of interest MFE $\sum (e, \psi)$ or GFV $\sum (e, \psi)$. Tables 1
and 2 show this using coarse and fine forward solutions.


Table 1. The forward problem with solution (4.1) is run at 10 x 10 next to 16 x 16. The adjoint problem is run at several grids
to show how the sum of terms approaches the direct calculation of $(e, \psi)$. The adjoint data components $\psi_u$ and $\psi_p$ are constant
everywhere and $\psi_\xi = 0$.

adj. grid            MFE Σ(e,ψ)   ΣMFE_i     ratio    GFV Σ(e,ψ)   ΣGFV_i      ratio
20x20 : 32x32        1.96E-3      1.47E-3    .749     -1.00E-3     -1.50E-3    1.49
40x40 : 64x64        1.96E-3      1.84E-3    .937     -1.00E-3     -1.13E-3    1.12
80x80 : 128x128      1.96E-3      1.93E-3    .984     -1.00E-3     -1.03E-3    1.03
160x160 : 256x256    1.96E-3      1.96E-3    .996     -1.00E-3     -1.01E-3    1.01

Table 2. The forward problem with solution (4.1) is run at 40 x 40 next to 64 x 64. The adjoint problem is run at several grids
to show how the sum of terms approaches the direct calculation of $(e, \psi)$. The adjoint data components $\psi_u$ and $\psi_p$ are constant
everywhere and $\psi_\xi = 0$.

adj. grid            MFE Σ(e,ψ)   ΣMFE_i     ratio    GFV Σ(e,ψ)   ΣGFV_i      ratio
80x80 : 128x128      1.23E-4      9.23E-5    .750     -7.00E-5     -1.01E-4    1.44
160x160 : 256x256    1.23E-4      1.15E-4    .937     -7.00E-5                 1.11
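The ratio columns in Tables 1 and 2 are consistent with the estimate divided by the directly computed error, often called an effectivity ratio. The sketch below recomputes this quotient from the printed MFE and GFV entries of Table 1; the function name is hypothetical, and small differences from the printed ratios are expected because the tabulated values are rounded.

```python
def effectivity(estimate: float, true_error: float) -> float:
    """Estimated error divided by the directly computed error."""
    return estimate / true_error


# (estimate, true error, printed ratio) for each adjoint grid in Table 1.
table1_mfe = [(1.47e-3, 1.96e-3, 0.749), (1.84e-3, 1.96e-3, 0.937),
              (1.93e-3, 1.96e-3, 0.984), (1.96e-3, 1.96e-3, 0.996)]
table1_gfv = [(-1.50e-3, -1.00e-3, 1.49), (-1.13e-3, -1.00e-3, 1.12),
              (-1.03e-3, -1.00e-3, 1.03), (-1.01e-3, -1.00e-3, 1.01)]

for estimate, true_error, printed in table1_mfe + table1_gfv:
    print(f"{effectivity(estimate, true_error):.3f}  (printed {printed})")
```

A ratio close to one indicates that the adjoint grid is fine enough for the estimate to be trusted at that forward resolution.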

4.2  Convergence 

To  compare  the  accuracy  of  the  various  approximations,  we  use  the  2-norms 


We use the manufactured solution from the previous section ($a = 1$ and $p$ is given by (4.1)). We compare
the 2-norm errors of the finite element and geometric finite volume methods on a sequence of grids in
order to assess the convergence rate. The coarsest grid is 10 x 10 next to 16 x 16, and the number of cells
in each dimension is doubled with each refinement.

The results in Tables 3-6 show that the convergence rate for the geometric finite volume method deteriorates
for the $u_x$, $u_y$, and $\xi$ components when the number of cells along the fine side of the interface is not an
integer multiple of the number of cells along the coarse side of the interface. When the test is repeated
with a grid starting at 8 x 8 next to 16 x 16, the convergence rates for the two methods are equal. The
first order convergence of $p$ and $u$ for the MFE is to be expected (Arbogast et al., 2000).

Table 3. Convergence of solution component $p$, indicating a rate of $O(h)$.

grid                 MFE ||e_p||   MFE ratio   GFV ||e_p||   GFV ratio
10x10 : 16x16        1.20E-01      N/A         1.20E-01      N/A
20x20 : 32x32        5.98E-02      2.00        5.98E-02      2.00
40x40 : 64x64        2.99E-02      2.00        2.99E-02      2.00
80x80 : 128x128      1.49E-02      2.00        1.49E-02      2.00
160x160 : 256x256    7.47E-03      2.00        7.47E-03      2.00

Table 4. Convergence of solution component $u_x$, indicating a rate of about $O(h)$.

grid                 MFE ||e_ux||  MFE ratio   GFV ||e_ux||  GFV ratio
10x10 : 16x16        8.49E-02      N/A         8.63E-02      N/A
20x20 : 32x32        4.21E-02      2.02        4.26E-02      2.02
40x40 : 64x64        2.10E-02      2.00        2.14E-02      1.99
80x80 : 128x128      1.05E-02      2.00        1.09E-02      1.97
160x160 : 256x256    5.25E-03      2.00        5.63E-03      1.93

Table 5. Convergence of solution component $u_y$, indicating a rate of $O(h)$ for MFE but less for GFV.

grid                 MFE ||e_uy||  MFE ratio   GFV ||e_uy||  GFV ratio
10x10 : 16x16        8.41E-02      N/A         8.59E-02      N/A
20x20 : 32x32        4.20E-02      2.00        4.39E-02      1.96
40x40 : 64x64        2.10E-02      2.00        2.29E-02      1.92
80x80 : 128x128      1.05E-02      2.00        1.23E-02      1.86
160x160 : 256x256    5.25E-03      2.00        6.94E-03      1.77

Table 6. Convergence of solution component $\xi$, indicating a rate of $O(h^2)$ for MFE but only $O(h)$ for GFV.

grid                 MFE ||e_ξ||   MFE ratio   GFV ||e_ξ||   GFV ratio
10x10 : 16x16        7.53e-03      N/A         6.11e-03      N/A
20x20 : 32x32        1.89e-03      3.99        1.79e-03      3.42
40x40 : 64x64        4.72e-04      4.00        6.37e-04      2.80
80x80 : 128x128      1.18e-04      4.00        2.77e-04      2.30
160x160 : 256x256    2.95e-05      4.00        1.33e-04      2.09
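Since the number of cells in each dimension is doubled at each refinement, the observed convergence order between consecutive grids is the base-2 logarithm of the error ratio. The following sketch, with a hypothetical function name, applies this to the $\xi$ errors printed in Table 6.

```python
import math


def observed_orders(errors):
    """Observed order between consecutive grids, assuming h is halved each time."""
    return [math.log2(errors[i] / errors[i + 1]) for i in range(len(errors) - 1)]


# Errors in the interface unknown as printed in Table 6.
mfe_xi = [7.53e-3, 1.89e-3, 4.72e-4, 1.18e-4, 2.95e-5]
gfv_xi = [6.11e-3, 1.79e-3, 6.37e-4, 2.77e-4, 1.33e-4]

print([round(order, 2) for order in observed_orders(mfe_xi)])
print([round(order, 2) for order in observed_orders(gfv_xi)])
```

For these data the MFE orders stay close to two, while the GFV orders decrease steadily toward one as the grids are refined, in line with the caption of Table 6.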

4.3 Test Case 1

In the next problem, we explore accuracy for a solution that is not changing rapidly near the interface.
We find that the use of geometric projections does not lead to significant effects on accuracy. We let
the diffusivity $a$ be one everywhere and use the manufactured solution given by (4.1). The grid for the
forward problem is 20 x 20 next to 32 x 32. The adjoint grid is 80 x 80 next to 128 x 128, and the
adjoint data is a nonzero constant for $\psi_u$ and $\psi_p$, while $\psi_\xi = 0$.

We list the error contributions in Table 7. For the geometric approach, we list results for both
constant and linear extrapolation. The results show that the projection error for linear extrapolation is
only about one quarter of the residual error, while the projection error for constant extrapolation is much
larger. Fig. 5 shows the solution components for the finite element case. The geometric finite volume
solutions are very similar. Fig. 6 shows the adjoint solution components.


Table 7. Error terms for Case 1. The forward grid is 20 x 20 next to 32 x 32. The adjoint grid is 80 x 80 next to 128 x 128.
Terms are numbered as in the definitions of $MFE_i$ and $GFV_i$.

term          MFE        GFV (linear)   GFV (constant)
1             -1.6E-4    -1.6E-4        -1.5E-4
2             -6.1E-5    -6.1E-5        -6.1E-5
3             4.9E-4     4.9E-4         4.9E-4
4             1.9E-4     1.9E-4         1.9E-4
5             4.2E-8     -1.3E-6        1.0E-5
6             N/A        2.0E-4         1.7E-4
7             N/A        2.2E-5         3.9E-3
8             N/A        -7.1E-4        -7.0E-4
9             N/A        -2.8E-4        -2.7E-4
total         4.6E-4     -3.0E-4        3.6E-3
ratio_proj    N/A        .25            4.5
ratio_quad    N/A        1.1            1.1

Fig. 5. Finite element solution components for Case 1.

Fig. 6. Adjoint solution components for Case 1. Shown are plots for an adjoint solution using a 40 x 40 grid next to 64 x 64 grid.
A solution on a finer grid is used to compute the estimates.


4.4  Test  Case  2 

The next test problem presents a more difficult solution for which the geometric projection error is by
far the largest source of error. The grid is 40 x 40 next to 64 x 64 and the boundary conditions are $g = 0$
on both subdomains. Fig. 7 shows profiles of the source and diffusivity, while Fig. 8 shows the adjoint
data.¹

Fig. 7. Source $f$ (left) and diffusivity $a$ (right) profiles for Test Case 2. The plots are shown in one dimension since the source
and diffusivity have no variation in the x-direction.

Fig. 8. Adjoint data profiles for Case 2. The plots of $\psi_{u_x}$, $\psi_{u_y}$, and $\psi_p$ are shown in one dimension because they have no variation
in the x-direction.


Because the source is large but the diffusivity is small along the interface, the solution changes rapidly near this region. This leads to relatively large errors near the interface for the geometric finite volume method. When the adjoint data is concentrated near the interface, the relative size of these errors is revealed. Table 8 lists the error terms. For this particular example problem, and this particular error measure, the error due to geometric projection is nearly eighty times the total error associated with the residuals. Fig. 9 shows the solution components for the finite element case, Fig. 10 shows the solution components for the geometric finite volume case, and Fig. 11 shows the adjoint solution.


¹ The shapes in Figs. 7 and 8 are based on a normalized Gaussian with parameters a, b, and c. The parameter a is for normalization. The parameter b determines the location of the peak and is set to zero to coincide with the interface. The parameter c determines the width of the peak and is set to c = .2. In the case of diffusivity the dip is produced by subtracting 10 times a Gaussian-shaped function from the constant 10.3.

Table 8. Error terms for Case 2. The forward grid is 40 x 40 next to 64 x 64. The adjoint grid is 160 x 160 next to 256 x 256.

term          MFE        GFV (linear)   GFV (constant)
1             1.6E-5     1.9E-5         2.4E-5
2            -2.6E-5    -2.6E-5        -2.5E-5
3            -3.1E-5    -3.1E-5        -3.1E-5
4             4.6E-5     4.6E-5         4.6E-5
5             4.8E-8     9.4E-6         2.6E-5
6             N/A        2.3E-3         2.7E-3
7             N/A        9.0E-4         8.4E-3
8             N/A        1.2E-3         1.2E-3
9             N/A       -2.4E-4        -2.5E-4
total         5.1E-6     4.2E-3         1.2E-2
ratio_proj    N/A        25             73
ratio_quad    N/A        11             9.7
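
As a quick consistency check on Table 8, the sketch below adds up the nine listed terms for each method and compares the result with the reported total. It assumes that the "total" row is the plain sum of terms 1 through 9 (the text does not state this explicitly) and that N/A entries contribute nothing; the dictionary names are illustrative only.

```python
# Error terms 1-9 from Table 8 (Case 2). None marks an entry reported as N/A.
terms = {
    "MFE": [1.6e-5, -2.6e-5, -3.1e-5, 4.6e-5, 4.8e-8, None, None, None, None],
    "GFV (linear)": [1.9e-5, -2.6e-5, -3.1e-5, 4.6e-5, 9.4e-6,
                     2.3e-3, 9.0e-4, 1.2e-3, -2.4e-4],
    "GFV (constant)": [2.4e-5, -2.5e-5, -3.1e-5, 4.6e-5, 2.6e-5,
                       2.7e-3, 8.4e-3, 1.2e-3, -2.5e-4],
}
reported_total = {"MFE": 5.1e-6, "GFV (linear)": 4.2e-3, "GFV (constant)": 1.2e-2}


def column_sum(values):
    """Sum the listed terms, skipping N/A entries."""
    return sum(v for v in values if v is not None)


sums = {name: column_sum(vals) for name, vals in terms.items()}
for name in terms:
    rel = abs(sums[name] - reported_total[name]) / abs(reported_total[name])
    print(f"{name:15s} sum = {sums[name]: .3e}  reported = {reported_total[name]: .3e}"
          f"  relative difference = {rel:.1%}")
```

Under this assumption the sums agree with the reported totals to within the two-digit rounding of the table entries.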

Fig. 10. Geometric finite volume solution components for Case 2. Zooming in reveals that the solution is discontinuous across the interface.


Fig.  11.  Adjoint  solution  components  for  Case  2.  Shown  are  plots  for  an  adjoint  solution  using  a  40  x  40  grid  next  to  64  x  64 
grid.  A  solution  on  a  finer  grid  is  used  to  compute  the  estimates. 


4.5  Test  Case  3 

In our final example, we examine a problem that places only one cell in the x-direction in one of the subdomains. Such a grid is only appropriate if the solution in that subdomain is essentially one dimensional, and varies only parallel to the interface. This situation arises in core-edge coupling in a tokamak fusion reactor.

We  construct  a  problem  with  a  solution  that  is  very  nearly  one  dimensional  in  one  subdomain,  and 
contains  variation  in  the  second  dimension  well  away  from  the  interface.  The  pressure  component  of 
the  solution  is 

pix,y)  =  cos  +0  3  sin{;rx) 

The  grid  is  1  x  32  next  to  32  x  32  and  the  boundary  conditions  are  provided  by  evaluating  the  known 
solution  at  the  outer  domain  boundaries.  The  source  for  the  problem  is  computed  by  substituting  the 
chosen  solution  into  the  PDE.  The  diffusivity  a  is  one  everywhere.  The  adjoint  data  is  concentrated  in 
the  finer  subdomain,  and  is  shown  in  Fig.  12. 


1-tanh  (2(1.5-y)) 


(4.2) 



Fig. 12. Adjoint data profiles for Case 3. The adjoint data for the subdomain variables are shown in one dimension because they have no variation in the x-direction, and the adjoint data for the interface variable is a one dimensional function defined on the interface.

Table 9 lists the error terms. For this example problem, the contribution due to geometric projection with linear extrapolation is approximately ten times the total contribution associated with the residuals, despite the fact that the solution is changing slowly near the interface. The projection contribution is much larger if constant extrapolation is used. Fig. 13 shows the solution components for the finite element case, Fig. 14 shows the solution components for the geometric finite volume case, and Fig. 15 shows the adjoint solution components.

Fig. 15. Adjoint solution components for Case 3. These plots are the adjoint solution using 64 x 64 next to 64 x 64 meshes. The estimates were computed using a finer grid.


5.  Iterative  solvers  and  coupling  strategies 

In  practice,  iterative  solution  of  the  coupled  system  is  often  employed.  The  specific  choice  of  solution 
method  is  often  constrained  by  certain  computational  logistics,  such  as  the  state  of  existing  codes  and 
data  structures.  We  briefly  discuss  some  aspects  of  iterative  solution.  The  primary  goal  is  to  show 
that  iterative  solution  strategies  applied  to  systems  like  (2.10)  can  also  be  applied  to  systems  like  (2.7) 
without  large  changes  to  the  computational  structure.  We  do  not  discuss  the  convergence  of  iterative 
solvers. 

5.1  Iteration  on  the  primary  variable 

A common iterative technique for the geometric finite volume method (2.10) is to start with an initial guess $(p_L, p_R)$ and proceed with the iteration (5.1). This iteration requires only the inversion of $A_L$ and $A_R$, that is, only single domain component solves. The application of $C_D$ and $C_N$ can be viewed as the coupling strategy, in which information is swapped between the subdomains.
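
The idea of iterating with single domain solves only can be sketched as a block iteration on a two-by-two block system. The code below is a minimal illustration, not the authors' iteration (5.1): it assumes the coupled system has diagonal blocks `A_L`, `A_R` and off-diagonal coupling blocks written here as `C_D`, `C_N`, uses a Gauss-Seidel ordering (an assumption), and runs on small made-up matrices.

```python
import numpy as np


def block_iteration(A_L, A_R, C_D, C_N, f_L, f_R, tol=1e-12, max_iter=500):
    """Block Gauss-Seidel: each sweep needs one solve with A_L and one with A_R."""
    p_L = np.zeros_like(f_L)
    p_R = np.zeros_like(f_R)
    for it in range(1, max_iter + 1):
        # Left solve uses the latest right-subdomain data (coupling via C_D).
        p_L_new = np.linalg.solve(A_L, f_L - C_D @ p_R)
        # Right solve uses the freshly updated left-subdomain data (coupling via C_N).
        p_R_new = np.linalg.solve(A_R, f_R - C_N @ p_L_new)
        change = max(np.abs(p_L_new - p_L).max(), np.abs(p_R_new - p_R).max())
        p_L, p_R = p_L_new, p_R_new
        if change < tol:
            break
    return p_L, p_R, it


# Hypothetical toy data: two diagonally dominant "single domain" matrices
# that are weakly coupled through one entry on each side of the interface.
n = 4
A_L = 2.0 * np.eye(n) - 0.5 * np.eye(n, k=1) - 0.5 * np.eye(n, k=-1)
A_R = 3.0 * np.eye(n) - 0.5 * np.eye(n, k=1) - 0.5 * np.eye(n, k=-1)
C_D = np.zeros((n, n))
C_D[-1, 0] = -0.2
C_N = np.zeros((n, n))
C_N[0, -1] = -0.2
f_L = np.ones(n)
f_R = np.linspace(1.0, 2.0, n)

p_L, p_R, sweeps = block_iteration(A_L, A_R, C_D, C_N, f_L, f_R)
K = np.block([[A_L, C_D], [C_N, A_R]])
residual = K @ np.concatenate([p_L, p_R]) - np.concatenate([f_L, f_R])
print("sweeps:", sweeps, " max residual:", np.abs(residual).max())
```

The toy coupling is deliberately weak so that the iteration contracts; as noted above, convergence of such iterations in general is not discussed here.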

It is possible to use an iteration of this type on the finite element system (2.7) as well. We must first reduce to a system in $p$ by a preprocessing procedure. We first eliminate $u_L$ and $u_R$, which results in a reduced system that we write succinctly in block form. We then eliminate $\xi$ to obtain system (5.4).


System  (5.4)  has  the  same  structure  as  (2.10),  so  an  iteration  analogous  to  (5.1)  can  be  applied.  The 
stencil  within  the  diagonal  blocks  of  (5.4)  is  very  close,  but  not  identical,  to  the  stencil  of  a  single 
domain  discretization.  The  difference  occurs  only  in  the  stencil  corresponding  to  cells  touching  the 
interface. 

In  some  cases,  e.g.,  the  use  of  black  box  single  domain  solvers,  it  is  necessary  to  construct  a  system 
in  which  the  diagonal  blocks  correspond  exactly  to  single  domain  discretizations.  If  this  is  the  case,  the 
strategy  of  “discretization  consistent  interface  conditions”  provides  a  partial  solution.  In  this  strategy, 
the  diagonal  blocks  are  single  domain  discretizations,  just  as  in  (2.10).  The  off  diagonal  blocks  are 
populated  by  writing  down  both  the  Dirichlet  and  Neumann  boundary  condition  equations  for  every 
cell  touching  the  interface,  rearranging  those  equations  to  isolate  the  boundary  value  terms  and  setting 
those  terms  equal  to  each  other  across  the  interface.  If  the  cell  ratio  along  the  interface  is  integer, 
such  as  4  next  to  8,  the  resulting  system  is  algebraically  equivalent  to  (5.4).  If  the  cell  ratio  is  not  an 
integer  ratio,  such  as  5  next  to  8,  the  equality  of  boundary  value  terms  across  the  interface  can  only  be 
enforced  approximately,  and  the  resulting  system  is  not  exactly  equivalent  to  (5.4).  While  a  complete 
discussion  of  the  implementation  of  discretization  consistent  interface  conditions  is  beyond  the  scope 
of  this  paper,  it  is  worth  consideration  as  an  alternative  to  the  full  mortar  method  in  cases  where  the 
computational  structure  is  constrained  by  black  box  single  domain  solvers  in  combination  with  iteration 
on  the  primary  variables.  The  concept  of  discretization  consistent  interface  conditions  is  similar  to 
strategies  employed  in  Farhat  et  al.  (1998)  and  Edwards  &  Rogers  (1998).  We  should  remark  that  the 
former  paper  recommended  against  mortar  methods  for  the  fluid-structure  interaction  problem,  due  to 
the  lack  of  theory  on  optimal  convergence  and  a  need  to  invert  a  large  interface  matrix.  However,  for 
the  problem  considered  in  this  paper,  the  mortar  method  does  achieve  optimal  convergence.  Moreover, 
we presented several computational strategies that do not require inversion of an interface matrix.
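
To see why an integer cell ratio is special, consider transferring piecewise constant values between two uniform partitions of the same interface. The sketch below is a hypothetical illustration (it is not the construction of discretization consistent interface conditions): `overlap_weights` computes, for each target cell, the fraction of that cell covered by each source cell.

```python
from fractions import Fraction


def overlap_weights(n_from, n_to):
    """W[i][j] = fraction of target cell i covered by source cell j on [0, 1]."""
    W = []
    for i in range(n_to):
        a, b = Fraction(i, n_to), Fraction(i + 1, n_to)
        row = []
        for j in range(n_from):
            c, d = Fraction(j, n_from), Fraction(j + 1, n_from)
            overlap = max(Fraction(0), min(b, d) - max(a, c))
            row.append(overlap / (b - a))
        W.append(row)
    return W


def is_injection(W):
    """True if every target cell lies inside exactly one source cell."""
    return all(w in (0, 1) for row in W for w in row)


W_4_8 = overlap_weights(4, 8)  # integer ratio, 4 next to 8
W_5_8 = overlap_weights(5, 8)  # non-integer ratio, 5 next to 8
print("4 -> 8 is a pure injection:", is_injection(W_4_8))
print("5 -> 8 is a pure injection:", is_injection(W_5_8))
```

With 4 next to 8, every fine cell sits inside a single coarse cell, so values can be matched exactly; with 5 next to 8, some fine cells straddle two coarse cells and the weights become fractional, which is consistent with the statement that equality across the interface can then only be enforced approximately.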


5.2  Iteration  on  interface  variables 

An alternative iterative strategy (Glowinski & Wheeler, 1988) uses the interface variables as the primary variables. If we combine the $u$ and $p$ variables into the symbol $\psi$, then system (2.7) can be written as a block system in the unknowns $\psi_L$, $\psi_R$, and $\xi$. We eliminate $\psi_i$, $i = L, R$, in terms of $\xi$, which gives a system (5.6) for $\xi$ alone.

If a Krylov method is applied to system (5.6), then only matrix vector products involving the matrix on the left are required. Since this matrix contains the inverses of the single domain blocks for the left and right subdomains, obtaining a matrix vector product amounts to performing single domain component solves. Once $\xi$ is obtained, $\psi$ is recovered as above.

In the setting of geometric coupling, we rewrite the geometric finite volume system as system (5.7), whose coefficient matrix, acting on the unknowns $(p_L, p_R, D, N)$, is

\[
\begin{pmatrix}
A_L & 0 & U_D & 0 \\
0 & A_R & 0 & U_N \\
0 & E_D & -I & 0 \\
E_N & 0 & 0 & -I
\end{pmatrix},
\qquad (5.7)
\]

where $A_L$ and $A_R$ are single domain finite volume systems, and the coupling strategy by which Dirichlet ($D$) and Neumann ($N$) data is provided by the opposite subdomain is defined by

\[ E_N p_L = N \quad \text{and} \quad E_D p_R = D. \]

Eliminating $D$ and $N$ from system (5.7) gives a system in $(p_L, p_R)$ with coefficient matrix

\[
\begin{pmatrix}
A_L & U_D E_D \\
U_N E_N & A_R
\end{pmatrix},
\]

which is identical to (2.10). If instead we eliminate $p_L$ and $p_R$, the system (5.7) becomes a system in $(D, N)$ with coefficient matrix

\[
\begin{pmatrix}
I & E_D A_R^{-1} U_N \\
E_N A_L^{-1} U_D & I
\end{pmatrix},
\qquad (5.8)
\]

which allows for an iteration of the form of (5.1) on the values $D$ and $N$, from which the primary variables can be recovered. Solving (5.8) by iteration is analogous to solving (5.6) by iteration, and both require only component solves.
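
A minimal sketch of iterating on the interface data rather than on the primary variables is given below. It assumes the block structure described above (single domain matrices `A_L`, `A_R`, data-injection operators `U_D`, `U_N`, and data-extraction operators `E_D`, `E_N`), uses a simple fixed-point sweep rather than a Krylov method, and runs on made-up toy matrices with a single Dirichlet and a single Neumann value; the right-hand sides `f_L`, `f_R` are hypothetical.

```python
import numpy as np


def interface_iteration(A_L, A_R, U_D, U_N, E_D, E_N, f_L, f_R,
                        tol=1e-13, max_iter=200):
    """Fixed-point iteration on the interface data (D, N); only component solves."""
    D = np.zeros(E_D.shape[0])
    N = np.zeros(E_N.shape[0])
    for _ in range(max_iter):
        # Dirichlet data comes from a right-subdomain solve driven by N.
        D_new = E_D @ np.linalg.solve(A_R, f_R - U_N @ N)
        # Neumann data comes from a left-subdomain solve driven by the new D.
        N_new = E_N @ np.linalg.solve(A_L, f_L - U_D @ D_new)
        change = max(np.abs(D_new - D).max(), np.abs(N_new - N).max())
        D, N = D_new, N_new
        if change < tol:
            break
    # Recover the primary variables from the converged interface data.
    p_L = np.linalg.solve(A_L, f_L - U_D @ D)
    p_R = np.linalg.solve(A_R, f_R - U_N @ N)
    return p_L, p_R, D, N


n = 4
A_L = 2.0 * np.eye(n) - 0.5 * np.eye(n, k=1) - 0.5 * np.eye(n, k=-1)
A_R = 3.0 * np.eye(n) - 0.5 * np.eye(n, k=1) - 0.5 * np.eye(n, k=-1)
U_D = np.zeros((n, 1)); U_D[-1, 0] = -0.5   # D enters the last left cell
U_N = np.zeros((n, 1)); U_N[0, 0] = -0.5    # N enters the first right cell
E_D = np.zeros((1, n)); E_D[0, 0] = 1.0     # D is read from the first right cell
E_N = np.zeros((1, n)); E_N[0, -1] = 1.0    # N is read from the last left cell
f_L = np.ones(n)
f_R = np.linspace(1.0, 2.0, n)

p_L, p_R, D, N = interface_iteration(A_L, A_R, U_D, U_N, E_D, E_N, f_L, f_R)

# Reference: direct solve of the system obtained by eliminating D and N.
K = np.block([[A_L, U_D @ E_D], [U_N @ E_N, A_R]])
reference = np.linalg.solve(K, np.concatenate([f_L, f_R]))
print("max difference from direct solve:",
      np.abs(np.concatenate([p_L, p_R]) - reference).max())
```

In this toy setting the interface unknowns number two while the primary unknowns number eight, which hints at why working on the interface can be attractive when the interface is small relative to the subdomains.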

6.  Conclusion 

The  geometric  finite  volume  method  (2.10)  is  often  used  to  discretize  problems  on  mismatched  grids. 
Using  the  fact  that  the  finite  volume  method  is  equivalent  to  the  mixed  finite  element  method  with  a 
certain  quadrature,  we  have  cast  (2.10)  in  a  form  analogous  to  a  mixed  finite  element  mortar  method. 
Doing  so  allows  us  to  directly  compare  the  two  discretizations  with  an  a-posteriori  error  estimate.  We 
have  shown  with  numerical  examples  that  while  the  geometric  finite  volume  method  performs  well  in 
some  cases,  there  are  cases  in  which  the  performance  deteriorates  dramatically  relative  to  the  mixed 
finite  element  mortar  method.  Furthermore,  such  cases  are  not  limited  to  problems  in  which  the  solution 
is  changing  rapidly  near  the  interface.  The  deterioration  was  shown  to  be  due  mainly  to  incorrect  transfer 
of  information  (or  projection  error)  across  the  interface.  Finally,  we  have  shown  that  the  mixed  finite 
element  mortar  method  can  be  paired  with  iterative  solvers  in  such  a  way  that  it  can  be  viewed  as  an 
alternative coupling strategy, requiring only single domain component solves at each iteration.

Abstract

This paper is concerned with the accurate computational error estimation of numerical solutions of multi-scale, multi-physics systems of reaction-diffusion equations. Such systems can present significantly different temporal and spatial scales within the components of the model, indicating the use of independent discretizations for different components. However, multi-discretization can have significant effects on accuracy and stability. We perform an adjoint-based analysis to derive asymptotically accurate a posteriori error estimates for a user-defined quantity of interest. These estimates account for leading order contributions to the error arising from numerical solution of each component, an error due to incomplete iteration, an error due to linearization, and for errors arising due to the projection of solution components between different spatial meshes. Several numerical examples with various settings are given to demonstrate the performance of the error estimators.

Keywords:  reaction-diffusion,  adjoint  operator,  a  posteriori  estimates,  discontinuous  Galerkin  method, 
iterative  method,  multirate  method,  multi-scale  discretization,  operator  decomposition 


1.  Introduction 

This  paper  is  concerned  with  the  accurate  computational  error  estimation  of  numerical  solutions 
of  multi-scale,  multi-physics  systems  of  reaction-diffusion  equations.  The  components  of  solutions  of 
such  multi-scale,  multi-physics  models  typically  exhibit  spatial  and  temporal  behavior  occurring  over 
a  significant  range  of  scales.  For  example,  consider  the  well-known  Brusselator  model  for  chemical 
dynamics [25, 1]. This is a system of reaction-diffusion equations whose separate components can
behave  over  different  spatial  and  temporal  scales  for  particular  choices  of  parameters.  The  model  is 

\[
\begin{aligned}
\dot u_1 - \epsilon_1 \Delta u_1 &= \alpha - (\beta + 1) u_1 + u_1^2 u_2, & x \in \Omega \subset \mathbb{R}^2,\ t > 0, \\
c\,\dot u_2 - \epsilon_2 \Delta u_2 &= \beta u_1 - u_1^2 u_2, & x \in \Omega,\ t > 0, \\
u_1(x, t) = \alpha, \quad u_2(x, t) &= \beta/\alpha, & x \in \partial\Omega,\ t > 0, \\
u_1(x, 0) = u_{1,0}(x), \quad u_2(x, 0) &= u_{2,0}(x), & x \in \Omega,
\end{aligned}
\qquad (1)
\]

where $u_1$ and $u_2$ are concentrations of species 1 and 2, respectively. We assume that $0 < \epsilon_0 \le \epsilon_i$, $i = 1, 2$, for some positive constant $\epsilon_0$. The solutions are multi-scale in time and space for a wide range of parameter values. In Fig. 1 we show the solution at $t = 1.0$ corresponding to $\alpha = 2$, $\beta = 5.45$, $\epsilon_1 = 0.008$, $\epsilon_2 = 0.08$, $c = 20$ and initial conditions $u_{1,0}(x) = \alpha + 0.1 \sin(\pi x_1)\sin(\pi x_2)$ and $u_{2,0}(x) = \beta/\alpha + 0.1 \sin(\pi x_1)\sin(\pi x_2)$. We plot a cross section of the numerical solution at $x_2 = 0.25$ in Fig. 2. There are sharp spatial gradients for the component $u_1$, while $u_2$ shows relatively less spatial variation, suggesting that we might use a relatively finer mesh to resolve $u_1$. The time evolution of the solution at the point $x = (0.25, 0.25)$ is also shown in Fig. 2 and indicates the multirate nature of the solutions, in which $u_1$ is a faster component than $u_2$ and requires relatively fine time steps for accurate resolution.
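
The boundary values in the Brusselator model coincide with the spatially uniform equilibrium of the reaction terms. The short check below evaluates the two reaction terms at that state for the parameter values quoted above; the function name `brusselator_reaction` is an illustrative choice, not taken from the paper.

```python
def brusselator_reaction(u1, u2, alpha, beta):
    """Reaction terms of the Brusselator model for species 1 and 2."""
    f1 = alpha - (beta + 1.0) * u1 + u1 ** 2 * u2
    f2 = beta * u1 - u1 ** 2 * u2
    return f1, f2


alpha, beta = 2.0, 5.45
u1_star, u2_star = alpha, beta / alpha   # the prescribed boundary values

f1_star, f2_star = brusselator_reaction(u1_star, u2_star, alpha, beta)
print("reaction terms at (alpha, beta/alpha):", f1_star, f2_star)

# A small perturbation of the kind used in the initial conditions (amplitude 0.1)
# moves the state off the equilibrium, so the reaction terms no longer vanish.
f1_pert, f2_pert = brusselator_reaction(u1_star + 0.1, u2_star + 0.1, alpha, beta)
print("reaction terms at the perturbed state:", f1_pert, f2_pert)
```

Both reaction terms vanish (up to rounding) at $u_1 = \alpha$, $u_2 = \beta/\alpha$, so the dynamics shown in the figures are driven by the perturbation in the initial data.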


Figure 1: Brusselator: Color contour plots of the solution at T = 1.0. (a) $u_1(x)$. (b) $u_2(x)$.


Figure 2: Brusselator. (a) Spatial cross section of the solution at $x_2 = 0.25$ and T = 1.0. (b) Temporal cross section of the solution at x = (0.25, 0.25).

In practical situations, the error of approximate solutions of multi-scale, multi-physics evolution models is always significant. Simply providing an a priori analysis of convergence and an assertion that the error is small for sufficiently refined discretizations that cannot be achieved in practice is inadequate for scientific purposes. Hence, application of numerical solution to predictive science and engineering applications requires accurate estimation of the error in information computed from numerical solutions as part of the overall uncertainty quantification critical to scientific and engineering needs.

For multi-scale problems, the demands of computational efficiency (or simple necessity) suggest a multi-discretization approach that involves solving the distinct components of a multi-physics model using independent meshes and time steps chosen to resolve behavior on the pertinent scales. A multi-discretization strategy often has significant effects on the accuracy and stability of the numerical solution. Indeed, such multi-discretization methods fall into the general class of multi-scale operator decomposition methods [11], which typically employ some form of projection to link solutions computed on different spatial and temporal meshes and necessarily "synchronize" solutions that have been decoupled during an iterative solution process. Since these practices can have a complex effect on accuracy and stability, there has been a steady development of a posteriori error estimates for a wide range of multi-scale operator decomposition methods in recent years [13, 12, 16, 18, 6, 7, 22, 21, 23], extending earlier work on a posteriori error analysis employing computable residuals and adjoint problems, see e.g. [10, 8, 9, 15, 19, 5, 3, 4]. While the primary purpose of such estimates is to quantify the contributions of various sources of discretization error on accuracy and stability, the estimates can also provide guidance as to the choice of numerical parameters needed to obtain a desired accuracy.

The analysis of multi-discretization numerical methods for multi-scale systems of partial differential equations in this paper extends earlier results for multi-rate time integration schemes for initial value problems for ordinary differential equations in [14]. For simplicity, we consider a system comprised of two reaction-diffusion equations: Find $u = (u_1, u_2)^\top$ that satisfies

\[
\begin{aligned}
\dot u_1 - \nabla\cdot(\epsilon_1 \nabla u_1) &= f_1(u_1, u_2), & (x, t) \in \Omega \times (0, T], \\
\dot u_2 - \nabla\cdot(\epsilon_2 \nabla u_2) &= f_2(u_1, u_2), & (x, t) \in \Omega \times (0, T], \\
u_i(x, t) &= 0, & (x, t) \in \partial\Omega \times (0, T],\ i = 1, 2, \\
u_i(x, 0) &= g_i(x), & x \in \Omega,\ i = 1, 2,
\end{aligned}
\qquad (2)
\]

where $\Omega$ is a convex polygonal domain with boundary $\partial\Omega$, $\{f_i\}$ are differentiable functions of their arguments, $\{\epsilon_i\}$ and $\{g_i\}$ are smooth functions in $\Omega$, and there is a constant $\epsilon_0 > 0$ such that $\epsilon_i \ge \epsilon_0 > 0$ on $\Omega$. Finally, we also assume that

\[ f_i(0) = 0, \quad i = 1, 2. \qquad (3) \]

The latter assumption is used to define the adjoint problems employed for the a posteriori error analysis carried out in Sec. 4. The ideas and results extend to systems consisting of more than two equations in a straightforward way. Condition (3) can also be generalized, see Sec. 4. Finally, neglecting the vastly more difficult questions of existence, uniqueness, and regularity for the problem, the analysis also extends to problems with nonlinear diffusion coefficients, and we show the formal result in Sec. 7.

Whenever appropriate, we write the differential equations in a compact form

\[ \dot u - \nabla\cdot(\epsilon \nabla u) = f(u), \]

where $\epsilon = \mathrm{diag}(\epsilon_1, \epsilon_2)$, $\nabla u = [\nabla u_1\ \nabla u_2]^\top$, and $f(u) = [f_1(u)\ f_2(u)]^\top$. The diffusion coefficients, $\epsilon_1$ and $\epsilon_2$, and reaction terms $f_1$ and $f_2$ may induce different spatial and temporal properties for $u_1$ and $u_2$. We adopt a multi-discretization approach in which each component model is solved on its own scale. In order to facilitate this approach, we compute the solution using a common iterative approach in which each component model is solved while fixing the other component solutions. The individual component solves are synchronized by exchanging information at designated "synchronization" times. At each synchronization time, component exchanges are iterated a specified number of times before the solution proceeds to the next synchronization time.

In this paper, we derive accurate a posteriori error estimates for a quantity of interest obtained from a numerical solution computed using the iterative multi-discretization scheme. The estimates account for leading order contributions to the error arising from numerical solution of each component, multi-discretization, and iterative solution. The estimates quantify the relative size of the various contributions to the error. We demonstrate the accuracy of the estimates on a variety of examples.

The rest of the paper is organized as follows. In Sec. 2, we formulate an iterative multi-discretization Galerkin finite element method for (2). In Sec. 3, we formulate an analytic version of (2) that we use for the purpose of analysis. We present the first results of an analysis for the multi-discretization solution method in Sec. 4 followed by numerical examples in Sec. 5. In Sec. 6, we expand the analysis to include the effects of using different space meshes for the two components. We also give numerical results for the Brusselator problem in this section. Finally, in Sec. 7 we consider the analysis for systems in which the diffusion coefficient may depend on the solution.

2. An iterative multi-discretization Galerkin finite element method

In Alg. 1 we formulate the iterative multi-discretization Galerkin finite element method for (2). We first discretize $[0, T]$ into $0 = t_0 < t_1 < t_2 < \cdots < t_N = T$ with time steps $\{\Delta t_n = t_n - t_{n-1}\}_{n=1}^{N}$, $\Delta t = \max_{1 \le n \le N}\{\Delta t_n\}$, and time intervals $I_n = [t_{n-1}, t_n]$. We think of $\{t_n\}$ as synchronization times during which information between the two component solves interior to the nodes is exchanged iteratively. To each $t_n$, we assign a positive integer $M_n$, which is the number of iterations to be used when synchronizing the fast and slow components.

To solve the components over each synchronization interval, we divide the intervals $\{I_n\}$ into a number of smaller time steps. We let $L_{i,n}$, $i = 1, 2$, be two positive integers, where $L_{1,n}$ denotes the number of time steps used to solve subsystem 1 and $L_{2,n}$ the number of steps used for subsystem 2 on each synchronization interval. Without loss of generality, we assume $L_{1,n} = d_n L_{2,n}$ for some positive integer $d_n$, i.e., $L_{1,n}$ is divisible by $L_{2,n}$. We denote time steps for each component in the Galerkin formulation by $\Delta s_{i,n} = \Delta t_n / L_{i,n}$, with $\Delta s_i = \max_{1 \le n \le N}\{\Delta s_{i,n}\}$. We use an extension of the discontinuous Galerkin method [15]. The method naturally extends to the continuous Galerkin method [15].
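
The divisibility assumption on the numbers of substeps means that the slow component's time nodes are a subset of the fast component's time nodes on every synchronization interval. The sketch below illustrates this for one interval; the interval endpoints and the values of `L2` and `d` are made up, and `substep_grid` is a hypothetical helper.

```python
from fractions import Fraction


def substep_grid(t_prev, t_next, n_steps):
    """Nodes of a uniform partition of [t_prev, t_next] into n_steps substeps."""
    dt = (t_next - t_prev) / n_steps
    return [t_prev + j * dt for j in range(n_steps + 1)]


# One synchronization interval I_n = [t_{n-1}, t_n] (hypothetical values).
t_prev, t_next = Fraction(0), Fraction(1, 10)
L2, d = 3, 4          # slow substeps and the integer ratio d_n
L1 = d * L2           # fast substeps, L_{1,n} = d_n * L_{2,n}

fast = substep_grid(t_prev, t_next, L1)
slow = substep_grid(t_prev, t_next, L2)

# Every slow node is also a fast node: slow[k] coincides with fast[d * k].
nested = all(slow[k] == fast[d * k] for k in range(L2 + 1))
print("fast step:", fast[1] - fast[0], " slow step:", slow[1] - slow[0])
print("slow nodes nested in fast nodes:", nested)
```

Exact rational arithmetic is used only so that the nestedness check is not affected by rounding.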

To construct the finite dimensional spaces, we first discretize $\Omega$ into triangulations $\mathcal{T}_{h_i}$, where $h_i$ denotes the maximum diameter of the elements of $\mathcal{T}_{h_i}$, $i = 1, 2$, i.e., each equation has a different triangulation. Each of these triangulations is arranged in such a way that the union of the elements of $\mathcal{T}_{h_i}$ is $\Omega$, and the intersection of any two elements is either a common edge, a node, or empty.

The approximations are polynomials in time and continuous piecewise polynomials in space on each space-time slab $S_{l,n} = \Omega \times I_{l,n}$ for $l = 1, \dots, L_{1,n}$ and $S_{k,n} = \Omega \times I_{k,n}$ for $k = 1, \dots, L_{2,n}$. Here $I_{l,n} = [t_{n-1} + (l-1)\Delta s_{1,n},\ t_{n-1} + l\,\Delta s_{1,n}]$ and $I_{k,n} = [t_{n-1} + (k-1)\Delta s_{2,n},\ t_{n-1} + k\,\Delta s_{2,n}]$ are the smaller time intervals.

In space, we let $V_{h_i} \subset H_0^1(\Omega)$ denote the space of continuous piecewise polynomial functions $v(x) \in \mathbb{R}$ defined on $\mathcal{T}_{h_i}$. (For simplicity we confine our attention to problems with homogeneous Dirichlet boundary conditions.) On each slab, we define

\[
\begin{aligned}
W^{q_1}_{l,n} &= \Big\{ w(x, t) : w(x, t) = \sum_{j=0}^{q_1} t^j v_j(x),\ v_j \in V_{h_1},\ (x, t) \in S_{l,n} \Big\}, \\
W^{q_2}_{k,n} &= \Big\{ w(x, t) : w(x, t) = \sum_{j=0}^{q_2} t^j v_j(x),\ v_j \in V_{h_2},\ (x, t) \in S_{k,n} \Big\}.
\end{aligned}
\]

We denote the jump across $t_n$ by $[w]_n = w_n^+ - w_n^-$, where $w_n^\pm = \lim_{s \to t_n^\pm} w(s)$. We let $\Pi_{1\to 2} : V_{h_1} \to V_{h_2}$ and $\Pi_{2\to 1} : V_{h_2} \to V_{h_1}$ be projections between the two spaces. The iterative discontinuous Galerkin dG(q) finite element approximation is written down in Alg. 1. In the algorithm, $U_1^{(m)} \in W^{q_1}_{l,n}$ and $U_2^{(m)} \in W^{q_2}_{k,n}$ are the finite element solutions, defined locally on time intervals $I_{l,n}$ and $I_{k,n}$. The notation $(a, b)$ denotes the $L^2$ inner product, or simply the spatial integral, $\int_\Omega ab\,dx$.

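Algorithm 1 Iterative multi-discretization Galerkin finite element method

Set $U^{(M_0)}(\cdot, t_0^-) = u(\cdot, t_0)$
for n = 1 to N do
  for m = 1 to $M_n$ do
    for l = 1 to $L_{1,n}$ do
      Compute $U_1^{(m)} \in W^{q_1}_{l,n}$ satisfying

      \[ \int_{I_{l,n}} \Big[ \big(\dot U_1^{(m)} - f_1(U_1^{(m)}, \Pi_{2\to 1} U_2^{(m-1)}), v\big) + \big(\epsilon_1 \nabla U_1^{(m)}, \nabla v\big) \Big]\,dt + \big([U_1^{(m)}]_{l-1,n}, v^+_{l-1}\big) = 0 \qquad (4) \]

      for all $v \in W^{q_1}_{l,n}$
    end for
    for k = 1 to $L_{2,n}$ do
      Compute $U_2^{(m)} \in W^{q_2}_{k,n}$ satisfying

      \[ \int_{I_{k,n}} \Big[ \big(\dot U_2^{(m)} - f_2(\Pi_{1\to 2} U_1^{(m)}, U_2^{(m)}), z\big) + \big(\epsilon_2 \nabla U_2^{(m)}, \nabla z\big) \Big]\,dt + \big([U_2^{(m)}]_{k-1,n}, z^+_{k-1}\big) = 0 \qquad (5) \]

      for all $z \in W^{q_2}_{k,n}$
    end for
  end for
end for

3. An analytic iterative method

The approach to the a posteriori analysis of the multi-discretization finite element approximation in Alg. 1 that we use in this paper starts with defining an iterative method to determine an analytic solution of (2), obtained via a sequence of functions $u_i^{(m)}$ that map the time intervals to the Banach space $X = L^2(\Omega)$, i.e., $u_i^{(m)}(t) : [t_{n-1}, t_n] \times X \to X$ for $i = 1, 2$. The iterative method defining $u_i^{(m)}$ is given in Alg. 2.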

The accuracy of the computational error estimate derived below assumes that the analytic iteration has converged to a sufficient extent and the discretization error is sufficiently small. The following assumptions provide sufficient general conditions to guarantee convergence of $u_i^{(m)}$ to $u_i$, $i = 1, 2$:

Assumption A.1. Assume that $f(t, u) : [t_{n-1}, t_n] \times X \times X \to X \times X$ is uniformly Lipschitz continuous with constant $L$, i.e.

\[ \|f(t, u) - f(t, v)\|_{X \times X} \le L \|u - v\|_{X \times X} \quad \forall t > 0. \qquad (8) \]

Similarly, we assume that $f'(u)$ is uniformly Lipschitz continuous with constant $L'$.

Assumption A.2. Let $M$ be the bound on the semigroup $G$ associated with (2) (defined in the Appendix). We assume that the time steps $\Delta t_n$ satisfy the inequality

\[ M L \Delta t_n \exp(M L \Delta t_n) < 1. \qquad (9) \]

The convergence proof is given in the Appendix. We note that these are sufficient conditions to guarantee convergence of the iteration. They are not necessary, and the iteration may converge in specific cases without satisfying these assumptions. Our a posteriori analysis assumes the iteration is convergent and employs the Lipschitz assumptions, but does not specifically depend on the bound on the semigroup.
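
Condition (9) is a restriction on the product $x = M L \Delta t_n$: since $x e^x$ is increasing for $x \ge 0$, the condition holds exactly when $x$ is below the root of $x e^x = 1$. The sketch below finds that root by bisection and converts it into a largest admissible synchronization step; the values of `M` and `L` in the usage line are hypothetical.

```python
import math


def max_scaled_step(tol=1e-14):
    """Root of x * exp(x) = 1 on [0, 1], found by bisection."""
    lo, hi = 0.0, 1.0  # g(0) = 0 < 1 and g(1) = e > 1, where g(x) = x * exp(x)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid * math.exp(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return lo


def max_time_step(M, L):
    """Supremum of the step sizes dt for which M * L * dt * exp(M * L * dt) < 1."""
    return max_scaled_step() / (M * L)


x_max = max_scaled_step()
print("threshold for M * L * dt:", x_max)
print("example bound on dt for M = 2, L = 5:", max_time_step(2.0, 5.0))
```

The threshold is roughly 0.567, so the sufficient condition asks for synchronization steps smaller than about $0.567 / (M L)$.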


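Algorithm 2 Analytic iterative method

for n = 1 to N do
  Set $u^{(0)}(\cdot, t) = u^{(M_{n-1})}(\cdot, t_{n-1})$ for $t \in I_n$
  for m = 1 to $M_n$ do
    Compute $u_1^{(m)}(x, t)$ in $\Omega \times I_n$ satisfying

    \[
    \begin{aligned}
    \dot u_1^{(m)} - \nabla\cdot(\epsilon_1 \nabla u_1^{(m)}) &= f_1(u_1^{(m)}, u_2^{(m-1)}), & (x, t) \in \Omega \times I_n, \\
    u_1^{(m)}(x, t) &= 0, & (x, t) \in \partial\Omega \times I_n, \\
    u_1^{(m)}(x, t_{n-1}) &= u_1^{(M_{n-1})}(x, t_{n-1}), & x \in \Omega.
    \end{aligned}
    \qquad (6)
    \]

    Compute $u_2^{(m)}(x, t)$ in $\Omega \times I_n$ satisfying

    \[
    \begin{aligned}
    \dot u_2^{(m)} - \nabla\cdot(\epsilon_2 \nabla u_2^{(m)}) &= f_2(u_1^{(m)}, u_2^{(m)}), & (x, t) \in \Omega \times I_n, \\
    u_2^{(m)}(x, t) &= 0, & (x, t) \in \partial\Omega \times I_n, \\
    u_2^{(m)}(x, t_{n-1}) &= u_2^{(M_{n-1})}(x, t_{n-1}), & x \in \Omega.
    \end{aligned}
    \qquad (7)
    \]

  end for
end for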

The motivation for introducing the analytic iterative solution method is the realization that the iterative multi-discretization Galerkin finite element method in Alg. 1 is a consistent finite element space-time discretization of Alg. 2. In particular, in (4) and (5) we have chosen piecewise space-time polynomials that solve the weak or variational formulation of (6) and (7) respectively. The variational formulation is obtained by multiplying each of (6) and (7) by appropriate test functions, integrating over space and time, and using Green's formula on the elliptic part. In practice, we evaluate the finite element function using quadrature to approximate the associated integral, which results in a set of discrete equations.
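
The structure of the iteration (solve the fast component with the slow one frozen, then the slow component with the updated fast one, and repeat on each synchronization interval) can be illustrated on an ordinary differential equation analogue with the diffusion terms dropped. The sketch below is a simplified stand-in, not the method of Alg. 1: it uses a linear two-component model with made-up coefficients `a1`, `b1`, `a2`, `b2`, implicit (backward Euler) substeps as the lowest-order analogue of the dG(q) time stepping, and no spatial discretization or projection.

```python
def multirate_interval(u1_0, u2_0, dt, L2, d, M,
                       a1=50.0, b1=1.0, a2=1.0, b2=1.0):
    """Iterate M times on one synchronization interval of length dt.

    Model (hypothetical): u1' = -a1*u1 + b1*u2 (fast), u2' = -a2*u2 + b2*u1 (slow).
    The fast component takes L1 = d*L2 substeps, the slow component L2 substeps.
    Returns the end-of-interval values (u1, u2) after each iteration.
    """
    L1 = d * L2
    ds1, ds2 = dt / L1, dt / L2
    U2 = [u2_0] * L2  # iterate 0: slow component frozen at its initial value
    history = []
    for _ in range(M):
        # Fast solve: implicit steps with the previous slow iterate frozen.
        U1, prev = [], u1_0
        for l in range(L1):
            k = l // d  # slow substep containing this fast substep
            prev = (prev + ds1 * b1 * U2[k]) / (1.0 + ds1 * a1)
            U1.append(prev)
        # Slow solve: implicit steps driven by the average of the new fast iterate.
        U2_new, prev = [], u2_0
        for k in range(L2):
            u1_avg = sum(U1[k * d:(k + 1) * d]) / d
            prev = (prev + ds2 * b2 * u1_avg) / (1.0 + ds2 * a2)
            U2_new.append(prev)
        U2 = U2_new
        history.append((U1[-1], U2[-1]))
    return history


history = multirate_interval(u1_0=1.0, u2_0=1.0, dt=0.1, L2=2, d=5, M=8)
changes = [abs(history[m + 1][1] - history[m][1]) for m in range(len(history) - 1)]
for m, c in enumerate(changes, start=1):
    print(f"change in u2 between iterations {m} and {m + 1}: {c:.3e}")
```

For this weakly coupled toy problem the changes between successive iterates shrink rapidly, which mirrors the role of the number of iterations per synchronization interval: stopping early leaves an iteration error in addition to the discretization error.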

4. A posteriori analysis of the iterative multi-discretization Galerkin finite element method

We derive computational a posteriori error estimates based on variational analysis, residuals of the finite element approximation, and the generalized Green's function solving the adjoint problem [8, 10, 9, 15, 19, 5, 3, 11, 4]. We first develop the analysis assuming the same spatial meshes for both components. We relax this restriction in Sec. 6, where we include the effect of projection between different spatial meshes.

A key feature of the analysis is the realization that the iterative multi-discretization approximation is naturally associated with a different adjoint operator than that for the original problem. For this reason, we use a different linearization than commonly employed for nonlinear problems [13]. We assume that the operators for the original problem and the analytic operator decomposition version share a common solution, and use that as a linearization point for determining the stability properties of solutions in the neighborhood of the linearization point. The simplest example is to assume a common steady-state solution such as 0, which is guaranteed by the homogeneity assumption (3), i.e., $f(0) = 0$. This assumption is employed in many standard analyses of the model (2) and it is satisfied in a great many cases. The condition can be generalized (see [13]), e.g. to other steady state solutions or to a given function of time. We give an example of a system (Brusselator) that uses an alternative condition in Sec. 6 [13]. We let

\[ f'_{ij}(u) = \int_0^1 \frac{\partial f_i}{\partial u_j}(s u)\,ds, \quad i, j = 1, 2, \qquad (10) \]

and $f'(u)$ denotes the square matrix whose entries are (10). Then $f(u) = f'(u)u$. Associated with this linearized form, we denote by $\varphi$ the generalized Green's function satisfying the following adjoint problem:

\[
\begin{aligned}
-\dot\varphi - \nabla\cdot(\epsilon \nabla \varphi) &= f'(u)^\top \varphi, & (x, t) \in \Omega \times (T, 0], \\
\varphi(x, t) &= 0, & (x, t) \in \partial\Omega \times (T, 0], \\
\varphi(x, T) &= \psi(x), & x \in \Omega,
\end{aligned}
\qquad
\epsilon \nabla \varphi = \begin{bmatrix} \epsilon_1 \nabla \varphi_1 \\ \epsilon_2 \nabla \varphi_2 \end{bmatrix}.
\qquad (11)
\]

On subinterval $I_n = (t_{n-1}, t_n)$, we define the solution operators $\Phi_n$ associated with the Green's function,

\[ \varphi(x, t) = \Phi_n(t) \psi_n(x), \]

for $t_n > t > t_{n-1}$ and some initial data $\psi_n$. To get a solution representation using the Green's functions, we multiply (11) by $u$ and integrate in time and space, resulting in

\[
\begin{aligned}
(u_n, \varphi_n) &= (u_{n-1}, \varphi_{n-1}) + \int_{I_n} \big(\dot u - \nabla\cdot(\epsilon \nabla u) - f'(u)u, \varphi\big)\,dt \\
&= (u_{n-1}, \varphi_{n-1}) + \int_{I_n} \big(\dot u - \nabla\cdot(\epsilon \nabla u) - f(u), \varphi\big)\,dt,
\end{aligned}
\qquad (12)
\]

where $u_n = u(\cdot, t_n)$ and $\varphi_n = \varphi(\cdot, t_n) = \psi_n$. Because $u$ solves (2), this last equality gives

\[ (u_n, \varphi_n) = (u_{n-1}, \varphi_{n-1}). \qquad (13) \]
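
A finite-dimensional analogue makes the solution representation concrete. For a linear system $\dot u = B u$ with adjoint $-\dot\varphi = B^\top \varphi$, $\varphi(T) = \psi$, the inner product $(u(t), \varphi(t))$ is constant in time, so the quantity of interest $(u(T), \psi)$ equals $(u(0), \varphi(0))$. The sketch below checks this numerically; the matrix `B`, the data, and the use of a classical Runge-Kutta integrator are illustrative assumptions and do not come from the paper.

```python
def matvec(B, v):
    """Matrix-vector product for small dense matrices stored as nested lists."""
    return [sum(B[i][j] * v[j] for j in range(len(v))) for i in range(len(B))]


def transpose(B):
    return [list(row) for row in zip(*B)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rk4_linear(B, v0, T, n_steps):
    """Integrate v' = B v from 0 to T with the classical fourth-order Runge-Kutta method."""
    h = T / n_steps
    v = list(v0)
    for _ in range(n_steps):
        k1 = matvec(B, v)
        k2 = matvec(B, [vi + 0.5 * h * ki for vi, ki in zip(v, k1)])
        k3 = matvec(B, [vi + 0.5 * h * ki for vi, ki in zip(v, k2)])
        k4 = matvec(B, [vi + h * ki for vi, ki in zip(v, k3)])
        v = [vi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
             for vi, a, b, c, d in zip(v, k1, k2, k3, k4)]
    return v


B = [[-1.0, 2.0], [0.5, -3.0]]   # hypothetical linearized reaction matrix
u0 = [1.0, -0.5]                 # initial data
psi = [0.3, 1.0]                 # adjoint data defining the quantity of interest
T = 1.0

u_T = rk4_linear(B, u0, T, 200)
# In reversed time tau = T - t the adjoint problem reads phi' = B^T phi, phi(0) = psi.
phi_0 = rk4_linear(transpose(B), psi, T, 200)

lhs = dot(u_T, psi)     # quantity of interest at the final time
rhs = dot(u0, phi_0)    # same quantity expressed through the adjoint solution
print("(u(T), psi) =", lhs, "  (u(0), phi(0)) =", rhs)
```

The two numbers agree to rounding error, which is the discrete counterpart of the identity relating the solution at the end of an interval to the adjoint solution at its start.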

4.1.  Analysis  of  the  analytic  iterative  method 

To simplify presentation, we express the analytic iterative method in Alg. 2 in a more compact format.
In particular, for any iteration index $m$, we write (6) and (7) as

\[
\dot u^{(m)} - \nabla\cdot(\epsilon\nabla u^{(m)}) = f(u^{(m)}) + \mathcal{R}^{(m)}. \tag{14}
\]

The vector $\mathcal{R}^{(m)}$ can be interpreted as residuals at the iteration level $m$.

To define an adjoint for the approximation in Alg. 2, we let $\varphi^{(k)}$ denote the generalized Green's
function that satisfies an adjoint problem on time interval $I_n$ as given in Alg. 3. Here $K_n$ refers to the number
of iterations to be used when synchronizing the two components of the adjoint.

Notice that the adjoint problems are solved backward in time and in the reverse order to that of the
forward problem, starting with $\varphi_2$ followed by $\varphi_1$. These generalized Green's functions are an iterative
approximation of (11). We note that the coefficients are linearized around $u^{(m)}$. As in the
forward problem, we can also rewrite this last algorithm into a compact form


-  y .  (cy,,<«) = ,  ,17) 

for  adjoint  iteration  level  k.  Here,  is  the  residual  of  the  adjoint  at  iteration  level  k.  We  also  introduce 
the  solution  operators  with  ip^^Hx.t)  =  <i>^^\t)'(p„[x),  for  t„>  t>  r„_i.  To  get  a  representation  of 
the  iterative  solution,  we  follow  a  similar  derivation  for  the  fully  coupled  problem  (see  (12)).  Multiplying 
equation  (17)  by  integrating  each  over  Ox  /„  and  applying  integration  by  parts  in  time,  and  Green’s 
Theorem  in  space  and  using  (14),  we  obtain  the  solution  representation  of  the  analytic  iterative  method 


(18) 
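Algorithm 3 Adjoint for the analytic iterative method

Set $\varphi_1^{(0)} = \psi_{1,n}$
for $k = 1$ to $K_n$ do
    Compute $\varphi_2^{(k)}(x,t)$ in $\Omega\times(t_n,t_{n-1}]$, satisfying
    \[
    \begin{aligned}
    -\dot\varphi_2^{(k)} - \nabla\cdot(\epsilon_2\nabla\varphi_2^{(k)}) &= f'_{22}(u^{(m)})\varphi_2^{(k)} + f'_{12}(u^{(m)})\varphi_1^{(k-1)}, & (x,t) &\in \Omega\times(t_n,t_{n-1}],\\
    \varphi_2^{(k)}(x,t) &= 0, & (x,t) &\in \partial\Omega\times(t_n,t_{n-1}],\\
    \varphi_2^{(k)}(x,t_n) &= \psi_{2,n}(x), & x &\in \Omega.
    \end{aligned}
    \tag{15}
    \]
    Compute $\varphi_1^{(k)}(x,t)$ in $\Omega\times(t_n,t_{n-1}]$, satisfying
    \[
    \begin{aligned}
    -\dot\varphi_1^{(k)} - \nabla\cdot(\epsilon_1\nabla\varphi_1^{(k)}) &= f'_{11}(u^{(m)})\varphi_1^{(k)} + f'_{21}(u^{(m)})\varphi_2^{(k)}, & (x,t) &\in \Omega\times(t_n,t_{n-1}],\\
    \varphi_1^{(k)}(x,t) &= 0, & (x,t) &\in \partial\Omega\times(t_n,t_{n-1}],\\
    \varphi_1^{(k)}(x,t_n) &= \psi_{1,n}(x), & x &\in \Omega.
    \end{aligned}
    \tag{16}
    \]
end for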

We  note  that  this  representation  is  not  in  the  standard  format  (in  which  the  solution  at  the  current  time 
level  solely  depends  on  the  previous  time  level  values).  It  contains  remnants  arising  from  the  iterative 
procedure  used  to  compute  both  forward  and  backward  problems.  The  second  term  can  be  interpreted 
as  the  weighted  average  of  the  forward  problem  residual  over  a  time  step.  The  third  term,  on  the  other 
hand,  is  the  weighted  average  of  the  backward  problem  residual  over  a  time  step.  Thus,  the  iterative 
nature of the solution procedure is reflected in this representation. Once convergence is reached on both
the forward and backward problems, the standard convention of solution representation using the
adjoint technique is recovered.

We are now able to express the error representation of the iterative implicit method. Let $e_n^{(m)} =
u_n - u_n^{(m)}$. Now, we state a lemma concerning an error equation over one time step.

Lemma  4.1.  The  analytic  iterative  method  satisfies  the  following  error  equation  over  one  time  step: 

=  («,.  -  «!;'>,  V'n)  =  +  .AcD„t^„) 

where  AO„  =  (0„ 

Proof. Subtracting (18) from (13), and adding and subtracting $(u_{n-1}^{(m)}, \Phi_n\psi_n)$,

{elf^\y/n)  =  {Un-u\f'\y/„) 

''^n  '  •'In  ' 

J/n  Jin  ' 


Adding  and  subtracting  *0  above  equation  completes  the  proof. 


□ 
4.2. Analysis of the iterative multi-discretization Galerkin finite element method

To construct the adjoint, let $z^{(m)} = s\,u^{(m)} + (1 - s)\,U^{(m)}$, with $s \in [0, 1)$. Then let $f'(z^{(m)})$ be a matrix
whose entries are

\[
f'_{ij}(z^{(m)}) = \int_0^1 \frac{\partial f_i}{\partial u_j}(z^{(m)})\,ds.
\]

Consequently, $f(u^{(m)}) - f(U^{(m)}) = f'(z^{(m)})\,(u^{(m)} - U^{(m)})$.
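The identity above can be checked numerically. The following is a minimal sketch, not part of the original analysis: it borrows the exponential reaction term of Example 5.4 (system (47)) purely for illustration, approximates the averaged Jacobian with a composite Simpson rule, and compares both sides. The function names and the sample states `u` and `U` are hypothetical choices made for this example.

```python
import math

def f(u):
    """Reaction term of system (47); note that f(0) = 0."""
    e1, e2 = math.exp(u[0]), math.exp(u[1])
    return (e1 + e2 - 2.0, -e1 - e2 + 2.0)

def jacobian(u):
    """Entries df_i/du_j of the reaction term above."""
    e1, e2 = math.exp(u[0]), math.exp(u[1])
    return ((e1, e2), (-e1, -e2))

def averaged_jacobian(u, U, n=200):
    """Composite Simpson approximation of int_0^1 Df(s*u + (1-s)*U) ds (n must be even)."""
    acc = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(n + 1):
        s = k / n
        w = 1.0 if k in (0, n) else (4.0 if k % 2 else 2.0)
        z = (s * u[0] + (1.0 - s) * U[0], s * u[1] + (1.0 - s) * U[1])
        J = jacobian(z)
        for i in range(2):
            for j in range(2):
                acc[i][j] += w * J[i][j]
    return [[a / (3.0 * n) for a in row] for row in acc]

# Hypothetical sample values standing in for u^(m) and U^(m) at one point.
u = (0.30, -0.20)
U = (0.25, -0.10)

A = averaged_jacobian(u, U)
lhs = tuple(a - b for a, b in zip(f(u), f(U)))
rhs = tuple(A[i][0] * (u[0] - U[0]) + A[i][1] * (u[1] - U[1]) for i in range(2))
mismatch = max(abs(l - r) for l, r in zip(lhs, rhs))
print(mismatch)
```

The mismatch is at the level of quadrature and rounding error, which is what makes the averaged matrix a convenient exact linearization in the error equation.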

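Associated with the finite element solution, we denote by $\vartheta$ the generalized Green's function that
satisfies the adjoint problem in Alg. 4.

Algorithm 4 Adjoint for the iterative multi-discretization Galerkin finite element method

Set $\vartheta_1^{(0)} = \psi_{1,n}$
for $k = 1$ to $K_n$ do
    Compute $\vartheta_2^{(k)}(x,t)$ in $\Omega\times(t_n,t_{n-1}]$ satisfying
    \[
    \begin{aligned}
    -\dot\vartheta_2^{(k)} - \nabla\cdot(\epsilon_2\nabla\vartheta_2^{(k)}) &= f'_{22}(z^{(m)})\vartheta_2^{(k)} + f'_{12}(z^{(m)})\vartheta_1^{(k-1)}, & (x,t) &\in \Omega\times(t_n,t_{n-1}],\\
    \vartheta_2^{(k)}(x,t) &= 0, & (x,t) &\in \partial\Omega\times(t_n,t_{n-1}],\\
    \vartheta_2^{(k)}(x,t_n) &= \psi_{2,n}(x), & x &\in \Omega.
    \end{aligned}
    \tag{19}
    \]
    Compute $\vartheta_1^{(k)}(x,t)$ in $\Omega\times(t_n,t_{n-1}]$ satisfying
    \[
    \begin{aligned}
    -\dot\vartheta_1^{(k)} - \nabla\cdot(\epsilon_1\nabla\vartheta_1^{(k)}) &= f'_{11}(z^{(m)})\vartheta_1^{(k)} + f'_{21}(z^{(m)})\vartheta_2^{(k)}, & (x,t) &\in \Omega\times(t_n,t_{n-1}],\\
    \vartheta_1^{(k)}(x,t) &= 0, & (x,t) &\in \partial\Omega\times(t_n,t_{n-1}],\\
    \vartheta_1^{(k)}(x,t_n) &= \psi_{1,n}(x), & x &\in \Omega.
    \end{aligned}
    \tag{20}
    \]
end for

As was the case in the adjoint formulation associated with the analytic iterative method, this algorithm
can be expressed in a compact form,

\[
-\dot\vartheta^{(k)} - \nabla\cdot(\epsilon\nabla\vartheta^{(k)}) = f'(z^{(m)})^{\top}\vartheta^{(k)} + \eta^{(k)},
\qquad
\eta^{(k)} = \begin{pmatrix} 0 \\ f'_{12}(z^{(m)})\bigl(\vartheta_1^{(k-1)} - \vartheta_1^{(k)}\bigr) \end{pmatrix}.
\tag{21}
\]

Here, $\eta^{(k)}$ is the residual of the adjoint at iteration level $k$.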

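At this stage, we are in a position to derive an error equation associated with the iterative
multi-discretization Galerkin finite element method. Let $e^{(m)} = u^{(m)} - U^{(m)}$. First notice that using integration
by parts,

\[
\bigl(\epsilon\nabla u^{(m)} - \epsilon\nabla U^{(m)}, \nabla\vartheta^{(k)}\bigr) = \bigl(\epsilon\nabla e^{(m)}, \nabla\vartheta^{(k)}\bigr) = \bigl(e^{(m)}, -\nabla\cdot(\epsilon\nabla\vartheta^{(k)})\bigr).
\]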

Similarly, 


,(!-) 


Furthermore,  using  continuity  of  u*"'*. 


5(m)  + 


(m)+  _jj[m)+ 

7-1,/? 


=  (« 


{m}- 

l-ln 


_//(/?!)-•)  _=(/n)- 


l-l.n- 
We use these three expressions on each time interval $I_{l,n}$ to obtain

=  +  f  dt 

h.n  ' 

+  f  [cV[/‘"'>-cVu^"'>,V^)'*"|<ir+  f  dt. 

•'ll.n  '■  ll.n  ' 

Hence, 

+  (eVt/‘"'\Vi9^*^>)dt  + (e<'”'.77‘^*^*)  dt  (22) 

+  f  (-t<‘"'’  +  V-(eVt/‘'”>)  +  /(M‘""),-0‘*^’)  dt 

h.n  ' 

Rearranging  the  terms  in  (22)  and  using  (14)  we  obtain  a  recursive  relation 

-f  |[(/^"'’-/(t/‘"'’).5^*^’]  +  (£:VLt‘"'’,V5‘*'’)]dt  ^23) 

+  f  [s^^'Kd^'^^jdt- f  (e‘"‘’,t/'/^)df. 

This  is  the  basis  for  the  equation  for  the  error  at  time  t„  stated  in  the  following  lemma. 

Lemma  4.2.  The  iterative  multi-discretization  finite  element  solution  satisfies  an  error  equation  over  one 
time  step: 

(gr^l/^.)  =  (4-’r.<i,)  +  Ql..  +  02.«-L  f  (e^'”\q^,^>]dt 

+  E  f  (<5ft(M^"”)-<5«(t/‘"^>),5‘*^1  dt 

1=1-1  Il.n  ^ 


where 


u/rmic 

0i.^=  L{/^ 

l=l  ^  (25) 

Proof.  This  is  obtained  by  using  the  recursive  relation  (23)  and  applying  integration  by  parts.  □ 

We note that this equation reflects the error arising from the consistent finite element numerical
discretization of the analytical iterative method. Similar to Lemma 4.1, this error contains the iteration
residuals weighted by the adjoint. The last term cannot be approximated easily since it involves the
error $e^{(m)}$ weighted by the iteration residual in the adjoint computation. However, provided that an a
priori estimate on $e^{(m)}$ is available, we can control this term to be relatively small due to the fact that the
residual can be made as small as needed when the adjoint computation is driven to convergence.

We  now  collect  all  the  results  above  and  obtain  an  error  representation  of  the  finite  element  multi¬ 
scale  iterative  implicit  method  by  setting  -  u  -  =  (u-  -  [/*”’’)  =  e*”'*  + 

Theorem  4.1,  Sety/N  =  y/ andy/„-\  in  Alg.  4  and  y>  n-\  =  in  Alg.  3,  for  n  =  N,N  ,\- 
Then  the  error  of  iterative  multi-discretization  finite  element  solution  at  final  time  ti\!  -  T  can  be  ex¬ 
pressed  as 

N  . 

=  L  Qi.«  +  Q2,«  +  03,«  +  Q4,«  +  Q5,«  +  Q6,«  .  (26) 

>1=1  ' 

Qi,,i  and  Q2.n  are  given  in  Lemma  4.2  with  m  =  Mn  and 

Qs.n^ff  dt 

1=1-1  ll.n 

Q^.n  =  i;  f  dt 

Qs.n  =  +  f  dt 

Jilt  ' 

.  Q6,„  =  (4^",’, -  f  dt, 

Proof.  First  we  estimate  the  error  over  one  time  step.  Combining  Lemma  4.2  and  with  Lemma  4.1  we 
get, 

(ef i/^„)  =  +  Qi.n  +  Oz.n  +  Q3,»  +  Q4,,.  +  Q5,„  +  Qe.;..  (27) 

We  note  that  since  and  ’  (see  Alg.  1),  we  have 

^(Mn)  _  jhig  yields  a  recursive  relation  in  terms  of  and  for  the  total  error  over  one 

time step. The error at the final time is obtained from undoing this relation and assuming e^““ = e^° =

0.  □ 


The terms $Q_{5,n}$ and $Q_{6,n}$ are not easy to approximate. However, provided the discretization error
and the iteration error are sufficiently small, $Q_{5,n}$ and $Q_{6,n}$ are asymptotically small compared with
$Q_{1,n}, \dots, Q_{4,n}$.

Theorem 4.2. The terms $\sum_{n=1}^{N} Q_{5,n}$ and $\sum_{n=1}^{N} Q_{6,n}$ are asymptotically small compared with $\sum_{n=1}^{N} Q_{1,n}, \dots,
\sum_{n=1}^{N} Q_{4,n}$ in the limit of iteration errors $\|u^{(M_n)} - u\|_{L^\infty(I_n; L^2(\Omega))}$ and $\|\varphi^{(K_n)} - \varphi\|_{L^\infty(I_n; L^2(\Omega))}$ tending to zero
for all $n$.

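Proof. Of the two, $Q_{5,n}$ is more difficult to estimate. Let $\tilde\varphi$ be the solution of

\[
\begin{aligned}
-\dot{\tilde\varphi} - \nabla\cdot(\epsilon\nabla\tilde\varphi) &= f'(u^{(M_n)})^{\top}\tilde\varphi, & (x,t) &\in \Omega\times(t_n,t_{n-1}],\\
\tilde\varphi(x,t) &= 0, & (x,t) &\in \partial\Omega\times(t_n,t_{n-1}],\\
\tilde\varphi(x,t_n) &= \psi_n(x), & x &\in \Omega.
\end{aligned}
\tag{28}
\]

Notice that (28) and (11) differ only in terms of the linearization point for $f'$. Now we write

\[
\varphi - \varphi^{(K_n)} = (\varphi - \tilde\varphi) + (\tilde\varphi - \varphi^{(K_n)}) = \alpha + \beta,
\]

where $\alpha$ and $\beta$ satisfy, respectively,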


\[
-\dot\alpha - \nabla\cdot(\epsilon\nabla\alpha) = f'(u^{(M_n)})^{\top}\alpha + \delta f'^{\top}\varphi, \tag{29}
\]

=  (30) 

with zero initial and boundary conditions, and we designate the 2x2 matrix $\delta f' = f'(u) - f'(u^{(M_n)})$.
Multiplying (29) by $\alpha$, followed by integration over $(t, t_n)\times\Omega$, yields

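\[
\begin{aligned}
\|\alpha(t)\|^2 &\le \|\alpha(t)\|^2 + 2\int_t^{t_n} (\epsilon\nabla\alpha, \nabla\alpha)\,d\tau\\
&= 2\int_t^{t_n} \bigl(\alpha, f'(u^{(M_n)})\alpha\bigr)\,d\tau + 2\int_t^{t_n} (\varphi, \delta f'\alpha)\,d\tau\\
&\le 2L\int_t^{t_n} \|\alpha\|^2\,d\tau + 2\int_t^{t_n}\!\!\int_\Omega |\varphi|\,|\delta f'|\,|\alpha|\,dx\,d\tau,
\end{aligned}
\tag{31}
\]

where $\|\cdot\|$ is the norm in $L^2(\Omega)\times L^2(\Omega)$, and $|\cdot|$ is understood as the usual Euclidean vector norm for $\varphi$
and $\alpha$, or its corresponding matrix norm for $\delta f'$. There is a constant $C_\varphi < \infty$ such that $\|\varphi\|_{L^\infty(I_n; L^2(\Omega))} \le
C_\varphi$, see for example [20]. We apply the Cauchy-Schwarz and arithmetic-geometric mean inequalities to
the last term on the right hand side of (31) to get

\[
\begin{aligned}
\|\alpha(t)\|^2 &\le 2L\int_t^{t_n} \|\alpha\|^2\,d\tau + \int_t^{t_n} \|\delta f'\|^2\,d\tau + C_\varphi^2\int_t^{t_n} \|\alpha\|^2\,d\tau\\
&= \int_t^{t_n} \|\delta f'\|^2\,d\tau + (2L + C_\varphi^2)\int_t^{t_n} \|\alpha\|^2\,d\tau.
\end{aligned}
\]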

Gronwall's inequality then implies

\[
\|\alpha(t)\|^2 \le \exp\bigl((2L + C_\varphi^2)(t_n - t)\bigr)\int_t^{t_n} \|\delta f'\|^2\,d\tau.
\]

Similarly, we get
||)S(r)||2<exp((2L)(r„-  r)) 


Next, we multiply (29) and (30), respectively, by $u^{(M_n)}$, integrate each of them over $I_n$, apply
integration by parts, and use (14) to get


(u|/^f,a„_i)- dt  =  (a,5^^"hdt, 

'’In  ' 


and  thus 


Q5.n  =  [u^^’"\sj,<p]dt  +  j^  (a,5^^"’)dt  + [p,5^^''^)dt 

<f  {5pu^^"\(p]dt  +  ^  f  (Haf  +  npf)dt+^  f 

J/„  2J/„  2  Ji„ 

~  h  exp(C(r„- r))  (Il5y-|l^-t- ll^fl^"’ll^)dTdr  + i  115^^"’ 
Notice that the second and third terms in (36) involve integration of the square of residuals. Thus
these terms are asymptotically small compared to $Q_{j,n}$, $j = 1, \dots, 4$, as $u^{(M_n)}$ converges to $u$. Moreover,
the first integral in (36) dominates the other terms. We show this term is asymptotically small compared
to $Q_{3,n}$ as $u^{(M_n)}$ converges to $u$.

First we bound the term $Q_{3,n}$. From assumption A.1 we have,

<  I X  f  II lllisf"’ II  dt. 


^l,«  n 


^\.n  p 


-  Ilf  "\\dt  + 


E  f  ”l|dt+ E  f  I|i4^''  *’-t/2^"~^’||dt.  (38) 

/=lA,  l=\Jll.n 


Now,  the  first  and  the  third  terms  are  discretization  errors,  and  depend  on  the  order  of  the  numerical 
method,  say  p(Af,  h)  for  some  homogeneous  function  p.  From  (68)  in  the  proof  ofTheorem  9.1,  \\uf"^- 
j4^''“^*||  =  0{r^"),  for  some  t  <  1,  and 


lluf'-'-uf- 


’||dt  =  0(T^'''"‘). 


Combining  this  with  (37)  and  (38),  we  have. 


(53.n  =  0(p(At,/z)  +  T^'’+‘).  (40) 

We now return to estimate the first term in (36). Noting that $f'(u^{(M_n)})u^{(M_n)} = f(u^{(M_n)})$ and by the
assumption that $f'$ is Lipschitz continuous with constant $L'$, we have,

\[
\int_{I_n} \bigl(f'(u)u^{(M_n)} - f(u^{(M_n)}), \varphi\bigr)\,dt
= \int_{I_n} \bigl((f'(u) - f'(u^{(M_n)}))u^{(M_n)}, \varphi\bigr)\,dt
\le \int_{I_n} L'\,\|u - u^{(M_n)}\|\,\|\varphi\|\,dt. \tag{41}
\]

An  analysis  of  the  semigroup  associated  with  the  problem  similar  to  that  used  in  the  Appendix  to  derive 
(68)  yields  ||m-  u^^"’|I  =  0(r^'''‘'M-  Combining  this  with  (41)  and  using  appropriate  scaling  we  have, 

f  (/'(Ulu^'^"’  -/(u'^"),(p]  dt  =  OfT^"'"^).  (42) 

Jin  ' 

Hence, $Q_{5,n}$ is asymptotically smaller than $Q_{3,n}$ as $u^{(M_n)}$ converges to $u$.

Turning to $Q_{6,n}$, we note that it is a sum of two terms. The first term is a product of iteration errors
for the forward and adjoint problems, and is straightforward to bound as smaller than $Q_{j,n}$, $j = 1, \dots, 4$,
as the iterations converge. The second term in $Q_{6,n}$ is a product of discretization error and
iteration residual in the adjoint. This is bounded smaller than $Q_{j,n}$, $j = 1, \dots, 4$, by an argument similar to
that used for analogous expressions in $Q_{5,n}$.

□ 
4.3. A computational error estimate

The error representation in Theorem 4.1 contains terms involving the true continuum solution
as well as the true adjoint solutions $\varphi$ and $\vartheta$. We form a computational error estimate by
approximating the adjoint solutions in a finite dimensional space. These adjoint problems
are approximated by substituting the finite element solution for the continuum solution, as is common in
the adjoint-based error estimation literature. Further, the term $Q_{4,n}$ is expressed as

Here the term is a product of a difference of two residuals, and hence
we drop it in the computational error estimate. This leads to the following computational error
estimate.

Theorem 4.3. The error of the iterative multi-discretization finite element solution at final time $t_N = T$
can be approximated as,

N 

=  Z  (QM  +  Q2,n  +  Q3,n  +  Q4,4  (44) 


where, 


02.n  =  f  { [(- +/2 ((/{””, yf ’),4^"’’'') - (ezvyf 

Qs.n  =  E  f  dt 


^In  r  . 

Q3.n  =  E  dt 

;=i  Jii.n 

Q^.n  =  E  f  dt 

1=1 -Ikn  ''  ' 


We present interpretations of the computational error contributions in Table 1.

Notation   Contribution
Q_1        Discretization error in component U_1
Q_2        Discretization error in component U_2
Q_3        Iteration error for the numerical solution
Q_4        Error due to linearization in the computed adjoint problem

Table 1: Error contributions and their interpretations

Note that we have dropped $Q_{5,n}$ and $Q_{6,n}$ to obtain (44). As explained, this is reasonable provided the
iteration has converged to a sufficient degree and the discretization is sufficiently refined. The examples
below demonstrate that the estimate (44) provides a reasonably accurate approximation of the true error.
Remark 4.1. We note that computing the error estimate (44) involves the cost of solving the adjoint
problem in addition to computing the original approximation. The computational cost depends on how the
numerical adjoint problem is solved; however, the adjoint problem is at least linear, and hence often
involves less iteration than solving the original problem.

On this issue, it is important to note that if the practical application requires an accurate error
estimate to accompany a numerical solution, then the cost of the error estimate has to be compared with
the cost of alternative approaches to error estimation. There are other ways to treat numerical solutions
of coupled systems involving iteration, e.g. [18, 17, 16, 7]. Some of these approaches provide for direct
estimation of the effect of finite iteration on accuracy, at the cost of greatly increasing the number of
adjoint problems that must be solved. The estimate in Theorem 4.1 is thus relatively inexpensive, at the
price of assuming that the iteration has converged to a sufficient degree.

Remark 4.2. Standard adaptive error control strategies based on the Principle of Equidistribution applied
to "dual-weighted" a posteriori estimates, [8, 9, 5, 19, 3], can be extended to (44) in a straightforward
way to balance all sources of error. For example, if the component $Q_1$ is large, then refining the spatial
and temporal mesh for the first component may lead to a more accurate solution. A similar conclusion
follows for $Q_2$. The terms $Q_3$ and $Q_4$ reflect errors incurred due to finite iterations, and these errors may
be reduced by increasing the number of iterations. However, we note that many application codes for
multi-physics problems eschew adaptive computation.


5.  Numerical  experiments  using  equal  spatial  meshes 

In this section, we present numerical examples to illustrate the performance of the error estimates.
For various problems, we show plots of the error estimate and true error accompanied by plots of the
individual contributions to the error estimate, $Q_{1,n}$, $Q_{2,n}$, $Q_{3,n}$, $Q_{4,n}$, as defined in Lemma 4.2 and Theorem
4.3. A comparison of the relative sizes of the different contributions to the error is often illuminating.

All forward problems are solved using continuous piecewise linear functions in space and
the piecewise constant discontinuous Galerkin method in time. The piecewise constant discontinuous
Galerkin method, or dG(0), is equivalent to the backward Euler scheme. The nonlinear equations are
solved using Newton's method. The adjoint solutions are approximated using continuous piecewise
quadratic functions in space and the piecewise linear continuous Galerkin method in time. The piecewise
linear continuous Galerkin method, or cG(1), is equivalent to the second order Crank-Nicolson
scheme. All problems are posed on the unit square, i.e., on $\Omega = [0, 1]\times[0, 1]$, and solved using a
uniform mesh containing (20 x 20 x 2) triangular elements. The initial conditions at time $t = 0$ are
$u = (\sin(\pi x_1)\sin(\pi x_2), \sin(\pi x_1)\sin(\pi x_2))^{\top}$.
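The difference in order between the two time discretizations can be seen on a scalar model problem. The sketch below is an illustration only: it assumes the test equation $y' = -y$, $y(0) = 1$ on $[0, 1]$ (not one of the systems studied here), for which dG(0) reduces to backward Euler and cG(1) reduces to Crank-Nicolson, and it estimates the observed order by halving the step. All function names are hypothetical.

```python
import math

def integrate(step, dt, t_end=1.0, y0=1.0):
    """Apply the one-step update `step(y, dt)` repeatedly up to t_end."""
    n_steps = round(t_end / dt)
    y = y0
    for _ in range(n_steps):
        y = step(y, dt)
    return y

def backward_euler(y, dt, lam=-1.0):
    # dG(0) for y' = lam*y: (y_new - y)/dt = lam*y_new
    return y / (1.0 - lam * dt)

def crank_nicolson(y, dt, lam=-1.0):
    # cG(1) for y' = lam*y: (y_new - y)/dt = lam*(y_new + y)/2
    return y * (1.0 + 0.5 * lam * dt) / (1.0 - 0.5 * lam * dt)

def observed_order(step, dt=0.1):
    """Estimate the convergence order from the errors at dt and dt/2."""
    exact = math.exp(-1.0)
    err_coarse = abs(integrate(step, dt) - exact)
    err_fine = abs(integrate(step, dt / 2.0) - exact)
    return math.log2(err_coarse / err_fine)

order_be = observed_order(backward_euler)
order_cn = observed_order(crank_nicolson)
print(f"backward Euler observed order: {order_be:.2f}")
print(f"Crank-Nicolson observed order: {order_cn:.2f}")
```

Using the higher-order scheme (together with quadratic elements in space) for the adjoint is the usual way to avoid the estimate vanishing through Galerkin orthogonality.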

The quantity of interest in all cases is given by the globally supported function $\psi = (\sin(\pi x_1)\sin(\pi x_2),
\sin(\pi x_1)\sin(\pi x_2))^{\top}$. We assess the estimators against the analytical solution
when it is available. Otherwise we use a "reference solution" computed with a higher order spatial
discretization and a finer time step. In our numerical results, we plot the different error components and
tabulate the effectivity ratio of the estimator. The effectivity ratio is defined as the ratio of the estimated
error to the true error in the quantity of interest, provided the true error is not zero. An accurate error
estimator has an effectivity ratio close to one.

5.1.  An  equal  rate  one-way  coupled  linear  system 

We consider the system,

\[
\begin{aligned}
\dot u_1 - \Delta u_1 &= \pi^2 u_1,\\
\dot u_2 - \Delta u_2 &= \pi^2(0.5u_2 + u_1).
\end{aligned}
\]

Notice that this is a one-way coupled system in which the variable of subsystem 1, $u_1$, is coupled to
the variable of subsystem 2, but $u_1$ can be solved independently of $u_2$. The exact solution is $u_1 =
e^{-\pi^2 t}\sin(\pi x_1)\sin(\pi x_2)$ and $u_2 = 2e^{-\pi^2 t}\sin(\pi x_1)\sin(\pi x_2)$, hence there is not a significant difference in
spatial or temporal scales. Since the system is only coupled in one direction there is no need to iterate
to solve the system and there is no iteration error, i.e., $Q_3 = 0$. Moreover, for linearly coupled systems
$\varphi = \vartheta$, and hence $Q_4 = 0$. The system is solved until $T = 0.2$ with $\Delta t = \Delta s_1 = \Delta s_2 = 0.02$. The error
estimate was -0.0177161, as compared to the true error of -0.0169774, for an effectivity ratio of 1.04.
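As a small worked check of the definition, the effectivity ratio quoted for this example follows directly from the two reported numbers. The helper name below is hypothetical.

```python
def effectivity_ratio(estimated_error, true_error):
    """Ratio of the estimated error to the true error in the quantity of interest."""
    if true_error == 0.0:
        raise ValueError("effectivity ratio is undefined when the true error is zero")
    return estimated_error / true_error

# Values reported for Example 5.1.
ratio = effectivity_ratio(-0.0177161, -0.0169774)
print(f"{ratio:.2f}")  # prints 1.04
```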


5.2.  A  multirate  coupled  linear  system 


We consider the system

\[
\begin{aligned}
\dot u_1 - \Delta u_1 &= -1000u_1 + u_2,\\
\dot u_2 - \Delta u_2 &= 999u_1 - 2u_2.
\end{aligned}
\tag{45}
\]

Here, $u_1$ is a fast variable and $u_2$ is a slow variable. We solve until $T = 0.2$ and plot the error components
as a function of $\Delta s_1$ in Fig. 3(a) while fixing $\Delta t = \Delta s_2 = 0.4$. We use two iterations at each of the time
steps $\Delta t$. As expected, the error in the component $Q_1$ decreases as $\Delta s_1$ is reduced. In Fig. 3(b) we plot
the effect of employing different numbers of iterations to solve the system at each time step $\Delta t$. In this
case, we use $\Delta t = 0.04$, $\Delta s_1 = \Delta t/32$ and $\Delta s_2 = \Delta t/16$. The iteration error decreases as the number of
iterations is increased. Except in the extreme case of just one iteration, the contribution to the error
from iteration is relatively small. In all cases, the error estimator provided an accurate prediction of the
exact error. We recall that for linear systems, $\varphi = \vartheta$, and hence $Q_4 = 0$. The accuracy of the estimator is
also illustrated in Table 2, which shows effectivity ratios close to the ideal value of 1.0.
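The separation of scales and the effect of the number of iterations can be illustrated on a reduced model. The sketch below makes two simplifying assumptions that are not part of the experiment above: diffusion is dropped, so only the reaction matrix of (45) is kept, and both components use the same backward Euler step $\Delta t = 0.04$. Each step is solved either exactly or by a fixed number of block Gauss-Seidel sweeps (component 1 with lagged $u_2$, then component 2), which mimics the iterative operator decomposition. All function names are hypothetical.

```python
import math

# Reaction matrix of system (45); diffusion is omitted in this sketch.
A = ((-1000.0, 1.0),
     (999.0, -2.0))

def eigenvalues_2x2(a):
    """Real eigenvalues of a 2x2 matrix from its characteristic polynomial."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = math.sqrt(trace * trace - 4.0 * det)
    return (trace - disc) / 2.0, (trace + disc) / 2.0

def coupled_step(y, dt):
    """One backward Euler step for y' = A y, solving the 2x2 system exactly."""
    m11, m12 = 1.0 - dt * A[0][0], -dt * A[0][1]
    m21, m22 = -dt * A[1][0], 1.0 - dt * A[1][1]
    det = m11 * m22 - m12 * m21
    return ((m22 * y[0] - m12 * y[1]) / det,
            (m11 * y[1] - m21 * y[0]) / det)

def iterated_step(y, dt, n_iter):
    """The same backward Euler step solved by n_iter block Gauss-Seidel sweeps."""
    z1, z2 = y  # initial guess: values from the previous time level
    for _ in range(n_iter):
        z1 = (y[0] + dt * A[0][1] * z2) / (1.0 - dt * A[0][0])  # uses lagged z2
        z2 = (y[1] + dt * A[1][0] * z1) / (1.0 - dt * A[1][1])  # uses updated z1
    return z1, z2

def iteration_error(n_iter, dt=0.04, n_steps=5):
    """Largest component difference between coupled and iterated solutions at T."""
    y_ref = y_it = (1.0, 1.0)
    for _ in range(n_steps):
        y_ref = coupled_step(y_ref, dt)
        y_it = iterated_step(y_it, dt, n_iter)
    return max(abs(y_ref[0] - y_it[0]), abs(y_ref[1] - y_it[1]))

fast, slow = eigenvalues_2x2(A)
errors = {m: iteration_error(m) for m in (1, 2, 3, 4)}
print("eigenvalues:", fast, slow)
print("iteration error by number of sweeps:", errors)
```

The eigenvalues of the reaction matrix are -1001 and -1, which is the three-orders-of-magnitude gap behind the "fast" and "slow" labels. In this reduced model the iteration error drops sharply after the first sweep, in line with the behaviour reported for Fig. 3(b).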
Figure 3: Example 5.2: $T = 0.2$. (a) $\Delta t = \Delta s_2 = 0.4$, $M = 2$. Error contributions as $\Delta s_1$ is varied. (b)
$\Delta t = 0.04$, $\Delta s_1 = \Delta t/32$, $\Delta s_2 = \Delta t/16$. Error contributions as $M$ is varied.


5.3.  A  coupled  nonlinear  system  using  different  time  steps 


We  consider  the  system. 


{ill  —  Ai<i  -  u\  +  U2, 

U2  —  AU2  -  U\  — 


(46) 


The system is solved until $T = 0.2$, with $\Delta t = 0.04$, $\Delta s_1 = \Delta t/16$ and $\Delta s_2 = \Delta t/2$. In Fig. 4 the result of
increasing the number of iterations is demonstrated. The component $Q_3$ is initially large, but decays to
a small value after two iterations. The component $Q_4$ is nonzero for this problem, since the adjoints $\vartheta$
and $\varphi$ differ from one another. However, it is quite small compared to other components. Again, the
error estimates are very accurate: the effectivity ratios, shown in Table 3, are close to
the ideal value of 1.0.

(a)                              (b)
Δs_1     Effectivity Ratio       M    Effectivity Ratio
0.4      1.12                    1    1.06
0.1      1.11                    2    1.09
         1.12                    3    1.09
0.025    1.12                    4    1.09

Table 2: Effectivity Ratios for the experiment in Fig. 3. (a) Effectivity Ratios as $\Delta s_1$ is varied. (b) Effectivity
Ratios as $M$ is varied.
Figure 4: Example 5.3: $T = 0.2$, $\Delta t = 0.04$, $\Delta s_1 = \Delta t/16$, $\Delta s_2 = \Delta t/2$. Error contributions as the number of
iterations $M$ is varied.

M    Effectivity Ratio
1    1.08
2    1.09
3    1.09

Table 3: Effectivity Ratios for the experiment in Fig. 4.


5.4.  A  coupled  nonlinear  system  using  equal  time  steps 
We consider the system

\[
\begin{aligned}
\dot u_1 - \Delta u_1 &= \exp(u_1) + \exp(u_2) - 2,\\
\dot u_2 - \Delta u_2 &= -\exp(u_1) - \exp(u_2) + 2.
\end{aligned}
\tag{47}
\]

The system is solved until $T = 0.2$, with $\Delta t = 0.01$, $\Delta s_1 = \Delta s_2 = \Delta t/2$. The effect of increasing the number
of iterations is shown in Fig. 5. The component $Q_3$ is large after just one iteration, but contributes
relatively little after two iterations. The component $Q_4$ is nonzero for this problem since the adjoints
$\vartheta$ and $\varphi$ differ from one another. The effectivity ratios for this experiment are shown in Table 4. The
effectivity ratios are quite close to 1.0, indicating the accuracy of our estimator.


Figure 5: Example 5.4: $T = 0.2$, $\Delta t = 0.01$, $\Delta s_1 = \Delta s_2 = \Delta t/2$. Error contributions as the number of
iterations $M$ is varied (horizontal axis of both panels: number of iterations, $M$). (a) True and estimated
errors. (b) $Q_3$ and $Q_4$ only.

M    Effectivity Ratio
1    1.10
2    1.09
3    1.09

Table 4: Effectivity Ratios for the experiment in Fig. 5 (a).


6. A posteriori analysis of the iterative multi-discretization Galerkin finite element method for
different spatial meshes

In this section, we derive an estimate for the case in which the two subsystems in Alg. 2 are solved
on different space meshes. For such systems, we can further decompose the error components to
reflect the projection errors. Solution of (4) involves the projection of $U_2^{(m-1)}$, denoted as $\Pi_{2\to1}U_2^{(m-1)}$,
from the finite element space of subsystem 2 to that of subsystem 1. If the number of time steps is the
same for the two subsystems, then $\Pi_{2\to1}$ is the projection of functions from the mesh for subsystem 2 to
functions on the mesh for subsystem 1. Similarly, solution of (5) involves the projection, $\Pi_{1\to2}U_1^{(m)}$, of
$U_1^{(m)}$ on the space of functions on the mesh of subsystem 2. With these projections we have the following
error representation.

Theorem  6.1.  Sef^/  ^  =  ^1/  and\f/n-i  =  ^  n~i  f^r  n  =  N,N  Then,  with  Assumptions  A.  1  and  A.2, 

the  error  of  iterative  multi-discretization  finite  element  solution  at  final  time  fjv  =  T  can  be  expressed  as 

^Qlb,;i  +  Qlc.n  +  Q2b.n  +  Q2c,n+Q3,n  +  Q4,n  +  Q5,n  +  Q6.nj,  (48) 
where $Q_{3,n}$, $Q_{4,n}$, $Q_{5,n}$, $Q_{6,n}$ are given in Theorem 4.3 and

Qib.n  =  E  { /^  [( - +  /i n2-, ’)  -  (e, vt/}"'\ vflf ))] dr 

Qib.n^tll  [(-yf'’  +  /2{ni_2^‘”^t/fh,^f)-(c2VLr{"‘\v5f>)]dr 
+  ((t/2‘"'>l,-,,n.53_,,J}. 

Q2c.n  =  t/'"'')  -/2(n,_2t/}'">,t/f' ’),5f ). 

Proof.  Adding  and  subtracting  |/i(t/{'”’,n2— 

Qi."  =  L  { I  [  +/i(t/,‘'"’,n2-it/f' -'>),<')  -  (ciVt/}"”, V5<«)  ]dr 

~  Qlb.n  Qlc.;? 

Similarly,  adding  and  subtracting  leads  to, 

Q2,«  =  Q2b.n  +  Q2c,h 

Combining  these  with  (26)  leads  to  (48).  □ 

For simplicity in our examples, one mesh will always be a refinement of the other mesh. Nodal
projection for the space meshes is employed for the operators $\Pi_{1\to2}$ and $\Pi_{2\to1}$. Further, we form a
computational  error  estimate  in  the  manner  outlined  in  Section  4.3,  representing  the  approximations 
of  the  terms  Q/,,;  as  We  recall  Table  1  that  describes  the  contributions  to  the  error. 

6.1.  A  Linear  System 

In  this  section,  we  consider  the  system, 

ill  -  Aui  =  Ui, 

U2  -  AU2  =  b-  Vui  +  U2, 

ui  =  5x?(l  -  l)xf(l  -X2)^(e"’’^  -  1). 

U2  =  sin(;rxi)  sin(;rx2), 

where  b  =  (1000,  1000)^.  The  quantity  of  interest  is  taken  to  be  t/'  =  (0,  lOOx^d  -xi)^x|(l  -X2)^)^.  Note 
that due to the presence of the term $b\cdot\nabla u_1$, the term $f'_{21}(u^{(m)})\varphi_2^{(k)}$ in Alg. 3 is interpreted as $-\nabla\cdot(b\varphi_2^{(k)})$.
The term $f'_{21}(z^{(m)})\vartheta_2^{(k)}$ in Alg. 4 is treated in a similar fashion.
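The interpretation of the adjoint of the convection coupling follows from one integration by parts. As a short sketch, assuming only that $\varphi_2$ vanishes on $\partial\Omega$ (as the adjoint boundary condition in (11) requires) and writing $n$ for the outward unit normal:

```latex
\begin{aligned}
\int_\Omega (b\cdot\nabla u_1)\,\varphi_2\,dx
 &= \int_{\partial\Omega} (b\cdot n)\,u_1\,\varphi_2\,ds
    - \int_\Omega u_1\,\nabla\cdot(b\,\varphi_2)\,dx \\
 &= \int_\Omega u_1\,\bigl(-\nabla\cdot(b\,\varphi_2)\bigr)\,dx .
\end{aligned}
```

The boundary integral drops out because $\varphi_2 = 0$ on $\partial\Omega$. For the constant vector $b$ used here, $-\nabla\cdot(b\,\varphi_2) = -b\cdot\nabla\varphi_2$.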

In the system above, the first two equations hold for $(x,t)\in\Omega\times(0,T]$, and the last two are initial
conditions posed for $(x,t)\in\Omega\times\{0\}$.

In the numerical experiments, subsystem 1 is solved on a uniform mesh comprising (40 x 40 x 2)
triangular elements. The mesh for subsystem 2 is varied through (5 x 5 x 2), (10 x 10 x 2), (20 x 20 x 2) and
(40 x 40 x 2) triangular elements and the system is solved with $\Delta t = \Delta s_1 = \Delta s_2 = 0.01$. We plot the error
components as a function of the ratio of mesh sizes in Fig. 6. The figure indicates that the projection
error $Q_{2c}$ dominates the total error when there is a large difference between the mesh sizes, and goes
to 0 as the two meshes approach the same size.
Figure 6: Example 6.1: $T = 0.2$, $\Delta t = \Delta s_1 = \Delta s_2 = 0.01$. Error contributions versus ratio of mesh sizes.


6.2.  The  Brusselator 

We recall the Brusselator problem (1) in Sec. 1. The values of the different parameters are the same as
in Section 1. The system as posed does not satisfy $f(0) = 0$. However, we use a change of variables to
accomplish this. The new variables are defined as

\[
\hat u_1 = u_1 - \alpha, \qquad \hat u_2 = u_2 - \beta/\alpha.
\]

With these new variables, $\hat u = (\hat u_1, \hat u_2)^{\top}$, the new set of equations satisfies the requirement that $f(0) = 0$.
We experiment with two different quantities of interest: a spatial quantity of interest at the final time,
and a time-based quantity of interest approximating the temporal derivative at a certain time.
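The effect of this shift can be checked directly. The sketch below assumes the standard Brusselator reaction terms, $f_1 = \alpha - (\beta + 1)u_1 + u_1^2 u_2$ and $f_2 = \beta u_1 - u_1^2 u_2$; the precise form of problem (1) and its parameter values are given in Sec. 1 and are not reproduced here, so the values of `alpha` and `beta` below are placeholders chosen only for illustration.

```python
def brusselator_reaction(u1, u2, alpha, beta):
    """Standard Brusselator reaction terms (assumed form of problem (1))."""
    f1 = alpha - (beta + 1.0) * u1 + u1 * u1 * u2
    f2 = beta * u1 - u1 * u1 * u2
    return f1, f2

def shifted_reaction(v1, v2, alpha, beta):
    """Reaction terms written in the shifted variables v1 = u1 - alpha, v2 = u2 - beta/alpha."""
    return brusselator_reaction(v1 + alpha, v2 + beta / alpha, alpha, beta)

# Placeholder parameter values, not those used in the report.
alpha, beta = 2.0, 5.0

original_at_zero = brusselator_reaction(0.0, 0.0, alpha, beta)
shifted_at_zero = shifted_reaction(0.0, 0.0, alpha, beta)
print("original f(0):", original_at_zero)
print("shifted  f(0):", shifted_at_zero)
```

Under these assumptions the original reaction term evaluates to $(\alpha, 0)$ at the origin, while the shifted one vanishes there, because $(\alpha, \beta/\alpha)$ is the steady state of the reaction terms.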


6.2.1.  A  spatial  quantity  of  interest  at  the  final  time 

For  this  experiment,  we  take  the  quantity  of  interest  to  be 


Xj  (1  -  Xi)^(exp(6Xj)  -  Dx^d  -  X2)^(exp(6x|)  -  1)1 
0  ]■ 


evaluated at final time $T = 0.7$. This quantity of interest is adapted from Ch. 8 in [2]. Mesh 1 and mesh
2 were chosen to be uniform with (40 x 40 x 2) and (20 x 20 x 2) triangular elements respectively, with
$\Delta t = \Delta s_2 = 0.001$ and $M = 2$. In Fig. 7(a) the effect of decreasing $\Delta s_1$ on the error components is evident:
the error component $Q_{1b}$ decreases as $\Delta s_1$ is reduced, as expected. Note that the total error increases
as $\Delta s_1$ is reduced, due to cancellation of errors with opposite sign. In Fig. 7(b) the effect of varying the
number of iterations $M$ is shown. The rest of the parameters are the same as for Fig. 7(a) except that
$\Delta s_1$ is fixed at 0.001. For $M = 1$ there are significant errors in the components $Q_3$ and $Q_4$, but these
errors decrease as the number of iterations is increased.
Figure 7: Brusselator: $T = 0.7$, $\Delta t = \Delta s_2 = 0.001$. (a) $M = 2$. Error contributions as $\Delta s_1$ is varied. (b)
$\Delta s_1 = 0.001$. Error contributions as $M$ is varied.


6.2.2.  Competing  effects  of  discretization  and  projection 

Separate refinement of either of the spatial meshes may result in a reduction of discretization errors
for the solution component(s) computed on that mesh, but may also increase projection errors. For
this experiment, mesh 2 was held fixed with (20 × 20 × 2) triangular elements while mesh 1 was varied,
having (20 × 20 × 2), (40 × 40 × 2) and (80 × 80 × 2) triangular elements. Here
$\Delta t = \Delta s_1 = \Delta s_2 = 0.001$ and $M = 2$. In Fig. 8(a) the error components are plotted for this
series of discretization levels for mesh 1. Note that the discretization error $Q_{1b}$ decreased as the mesh
ratio decreased (as mesh 1 was refined), but that the projection error $Q_{2c}$ increased. While the magnitude
of the reduction of $Q_{1b}$ exceeded the magnitude of the increase in $Q_{2c}$, the total error increased as
mesh 1 was refined due to cancellation of errors with opposite sign.

In Table 5 we tabulate the error contributions for three different choices of mesh 1, two uniform
and one non-uniform. The non-uniform mesh is shown in Fig. 8(b), where the mesh is refined in regions of
rapid variation of component $u_1$. Mesh 2 was uniform with (20 × 20 × 2) triangular elements for all three
cases. Here $\Delta t = \Delta s_2 = 0.001$, $\Delta s_1 = \Delta t/4 = 0.00025$, and $M = 2$.

When mesh 1 has (20 × 20 × 2) uniform triangular elements, the first row of Table 5 indicates that the
dominant error contribution is $Q_{1b}$, the discretization error on mesh 1. Halving the side length of each
element on mesh 1 produces a situation in which the discretization errors on both meshes and the projection
error are roughly of the same magnitude (row 2 of Table 5). Non-uniform refinement of mesh 1 such that it has
a finer mesh in regions of sharp variation produces a similar distribution of error with roughly 2/3 of the
number of elements (2320 versus 3200; row 3 of Table 5).


Mesh             Elements   Dof_1   Dof_2   Estimate   Q_{1b}    Q_{2b}    Q_{2c}    Q_3      Q_4
Coarse uniform      800      441     441    -0.1509   -0.2139    0.0618    0.0000   0.0003   0.0008
Fine uniform       3200     1681     441     0.0644   -0.0358    0.0377    0.0614   0.0002   0.0007
Non-uniform        2320     1241     441     0.0600   -0.0511    0.0398    0.0704   0.0003   0.0007

Table 5: Brusselator: $T = 0.7$, $\Delta t = \Delta s_2 = 0.001$, $\Delta s_1 = \Delta t/4 = 0.00025$, $M = 2$.
Error components for two uniform and one non-uniform mesh 1. Here Dof_i refers to degrees-of-freedom for the
component $u_i$.
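
The cancellation between components of opposite sign can be checked directly from the tabulated values. The
following Python sketch only transcribes the numbers of Table 5; the dictionary layout and the helper names
`component_sum` and `absolute_sum` are illustrative choices and not part of the original computation.

```python
# Signed error components transcribed from Table 5 (Brusselator, T = 0.7).
# "parts" holds the five component columns of the table, left to right.
rows = {
    "coarse uniform": {"estimate": -0.1509, "parts": [-0.2139, 0.0618, 0.0000, 0.0003, 0.0008]},
    "fine uniform":   {"estimate":  0.0644, "parts": [-0.0358, 0.0377, 0.0614, 0.0002, 0.0007]},
    "non-uniform":    {"estimate":  0.0600, "parts": [-0.0511, 0.0398, 0.0704, 0.0003, 0.0007]},
}


def component_sum(parts):
    """Signed sum of the error components (should reproduce the estimate)."""
    return sum(parts)


def absolute_sum(parts):
    """Sum of magnitudes: the size the total would have without cancellation."""
    return sum(abs(p) for p in parts)


for name, row in rows.items():
    signed = component_sum(row["parts"])
    unsigned = absolute_sum(row["parts"])
    print(f"{name:15s} signed sum = {signed:+.4f}  "
          f"estimate = {row['estimate']:+.4f}  sum of magnitudes = {unsigned:.4f}")
```

In each row the signed sum agrees with the reported estimate to within rounding of the tabulated digits,
while the sum of magnitudes is noticeably larger, which reflects the cancellation described in the text.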
Figure 8: Brusselator: $T = 0.7$, $\Delta t = \Delta s_1 = \Delta s_2 = 0.001$, $M = 2$. (a) Error contributions
versus ratio of mesh sizes. (b) Refined mesh used to produce error contributions provided in row 3 of Table 5.


6.2.3.  A  temporal  derivative  as  the  quantity  of  interest 

For this experiment, we approximate the time derivative $\frac{d}{dt}\int_\Omega u_1\,dx$ of the average value
of $u_1$ at some $t = t_D$ using a central difference. We approximate the temporal derivative of a function
$v$ by

\[
\left.\frac{dv}{dt}\right|_{t=t_D} \approx \frac{v(t_D + 0.5\Delta t) - v(t_D - 0.5\Delta t)}{\Delta t}. \tag{50}
\]

In practice we approximate the point value using a local average. That is,
$v(t) \approx \bar v(\tau) = \frac{1}{\tau}\int v(s)\,ds$, where the integral is taken over an interval of
length $\tau$ containing $t$. As $\tau \to 0$, $\bar v(\tau) \to v(t)$. The adjoint problem required a finer
time discretization near $t_D$ to resolve the adjoint solution accurately. Near $t = t_D$, we used a time
discretization that was 100 times finer than that used for the forward problem. That is, in this region the
time step is $\Delta t/100$, where $\Delta t$ is the time step for the forward problem. Moreover, we chose
$\tau = \Delta t/10$.
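
As a rough illustration of the central difference combined with local averaging, the following Python sketch
applies the formula to a smooth test function with a known derivative. The centred averaging window, the
midpoint quadrature, the test function `math.sin` and all function names are assumptions made for the
example; only the step sizes $\Delta t = 0.001$ and $\tau = \Delta t/10$ are taken from the experiment.

```python
import math


def local_average(v, t, tau, n=200):
    """Average of v over [t - tau/2, t + tau/2] using the composite midpoint rule.

    The centred window is an assumption; the text only states that a local
    average of width tau replaces the point value.
    """
    h = tau / n
    start = t - tau / 2
    return sum(v(start + (k + 0.5) * h) for k in range(n)) / n


def central_difference(v, t_d, dt, tau):
    """Central difference at t_d with point values replaced by local averages."""
    forward = local_average(v, t_d + 0.5 * dt, tau)
    backward = local_average(v, t_d - 0.5 * dt, tau)
    return (forward - backward) / dt


# Hypothetical test function with a known derivative.
t_d, dt = 0.7, 0.001
tau = dt / 10
approx = central_difference(math.sin, t_d, dt, tau)
exact = math.cos(t_d)
print(f"approx = {approx:.8f}, exact = {exact:.8f}, error = {abs(approx - exact):.2e}")
```

For this smooth function both the differencing error and the averaging error are second order in their
respective widths, so the approximation is close to the exact derivative.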

In Fig. 9(a) we investigate the effect of the number of iterations. We use $\Delta t = \Delta s_2 = 0.001$,
$\Delta s_1 = 0.0005$ and the same uniform mesh with (40 × 40 × 2) triangular elements for both components. For
$M = 1$, the estimate is dominated by the term $Q_3$ and then by $Q_4$, which measure the effect of the
number of iterations.

In Fig. 9(b) we show the effect of varying $\Delta s_1$ for fixed $M$. We use $\Delta t = \Delta s_2 = 0.001$,
$M = 3$ and the same uniform mesh with (40 × 40 × 2) triangular elements for both components. We see that the
error in the component $Q_{1b}$ decreases as $\Delta s_1$ is reduced, as expected. The component $Q_{1b}$ also
dominates the total error, so refining the time steps for this fast component leads to a significant
reduction of the total error as well.


7.  A  posteriori  analysis  for  systems  with  nonlinear  diffusion  coefficients 


To explain how the a posteriori analysis can be extended to fully nonlinear coupled systems, we provide
a formal derivation of a posteriori error estimates for systems of parabolic initial boundary value
problems having nonlinear diffusion coefficients. We consider the problem of finding $u = (u_1, u_2)^\top$ for
systems in which the diffusion coefficient $\epsilon = \epsilon(u)$ is a function of $u$,

\[
\begin{aligned}
\dot u_1 - \nabla\cdot\big(\epsilon_1(u_1,u_2)\nabla u_1\big) &= f_1(u_1,u_2), & (x,t) &\in \Omega\times(0,T],\\
\dot u_2 - \nabla\cdot\big(\epsilon_2(u_1,u_2)\nabla u_2\big) &= f_2(u_1,u_2), & (x,t) &\in \Omega\times(0,T],\\
u_i(x,t) &= 0, & (x,t) &\in \partial\Omega\times(0,T],\ i = 1,2,\\
u_i(x,0) &= g_i(x), & x &\in \Omega,\ i = 1,2,
\end{aligned}
\tag{51}
\]
Figure 9: Brusselator: Time derivative at $t_D = 0.7$ (final time is $T = 0.8$). (a) Effect as the number of
iterations $M$ varies. (b) Effect as $\Delta s_1$ varies given $M = 3$.


or in a compact form,

\[
\dot u - \nabla\cdot\big(\epsilon(u)\nabla u\big) = f(u),
\]

where $\epsilon(u) = \mathrm{diag}(\epsilon_1(u), \epsilon_2(u))$. Meaningful analysis of general parabolic
systems with nonlinear diffusion coefficients is very challenging, see for example [20]. Generally, analytic
results can be greatly improved by employing the special properties of particular systems. We assume that the
a priori analysis is in place and proceed to focus on the a posteriori analysis.

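We again use Alg. 1 to solve this system, with the obvious modification that $\epsilon = \epsilon(u)$, and for
simplicity we consider the scenario of having the same spatial discretization for $u_1$ and $u_2$. The adjoint
problem similar to (11) is modified to account for the dependence of $\epsilon$ on $u$,

\[
-\dot\phi - \nabla\cdot\big(\bar\epsilon(u)\nabla\phi\big) + \bar\epsilon'(u)^\top\cdot\nabla\phi = f'(u)^\top\phi, \tag{52}
\]

where $\bar\epsilon'(u)$ denotes a square matrix and $\bar\epsilon(u)$ is a diagonal matrix whose entries,
respectively, are

\[
\bar\epsilon'_{ij}(u) = \int_0^1 \frac{\partial\epsilon_i}{\partial u_j}(us)\,\nabla(u_i s)\,ds
\quad\text{and}\quad
\bar\epsilon_i(u) = \int_0^1 \epsilon_i(us)\,ds. \tag{53}
\]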

In compact form, the adjoint problem similar to (17) is also modified as

-  V  ■  (£(ti("'>)V(/?‘*'’)  +  e'(M‘"'')^  •  f (54) 

where  is  as  defined  in  (17),  e'iif'^'^)  and  eiu^"’i)  are  similarly  defined  as  in  (53),  and 

=  [0 


Similarly,  the  finite  element  adjoint  problem  (21)  is  modified  as, 

_  V .  (e(z("‘)) ve'*"’)  +  c'(z("'>)^  •  (55) 

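where $z^{(m)} = s u^{(m)} + (1-s) U^{(m)}$, $\bar\epsilon'(z^{(m)})$ is a square matrix and
$\bar\epsilon(z^{(m)})$ is a diagonal matrix whose entries, respectively, are

\[
\bar\epsilon'_{ij}(z^{(m)}) = \int_0^1 \frac{\partial\epsilon_i}{\partial u_j}(z^{(m)})\,\nabla z_i^{(m)}\,ds
\quad\text{and}\quad
\bar\epsilon_i(z^{(m)}) = \int_0^1 \epsilon_i(z^{(m)})\,ds. \tag{56}
\]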
The residual is as defined in (21) and

T7‘^’  =  [o 
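
The averaged coefficients used in the modified adjoint problems rest on an exact integral identity for the
difference of two nonlinear fluxes along the segment $z(s) = su + (1-s)U$. The following Python sketch checks
a scalar, pointwise analogue of that identity numerically. The coefficient $\epsilon(w) = 1 + w^2$, the sample
values and all function names are hypothetical, and the scalar setting is a simplification of the system case
treated in the text.

```python
def simpson(g, n=200):
    """Composite Simpson rule for the integral of g over [0, 1] (n even)."""
    h = 1.0 / n
    total = g(0.0) + g(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3


# Hypothetical scalar diffusion coefficient and its derivative.
def eps(w):
    return 1.0 + w * w


def deps(w):
    return 2.0 * w


def averaged_coefficients(u, U, grad_u, grad_U):
    """Return (eps_bar, eps_bar_prime) averaged along z(s) = s*u + (1 - s)*U."""
    z = lambda s: s * u + (1 - s) * U
    grad_z = lambda s: s * grad_u + (1 - s) * grad_U
    eps_bar = simpson(lambda s: eps(z(s)))
    eps_bar_prime = simpson(lambda s: deps(z(s)) * grad_z(s))
    return eps_bar, eps_bar_prime


u, U = 0.9, 0.6              # "exact" and "approximate" values at a point
grad_u, grad_U = 1.3, 1.1    # their gradients at the same point
eps_bar, eps_bar_prime = averaged_coefficients(u, U, grad_u, grad_U)

# Flux difference and its linearised form in the error e = u - U.
lhs = eps(u) * grad_u - eps(U) * grad_U
rhs = eps_bar * (grad_u - grad_U) + eps_bar_prime * (u - U)
print(f"flux difference = {lhs:.12f}, linearised form = {rhs:.12f}")
```

The two printed values coincide up to rounding because the identity follows from integrating
$\frac{d}{ds}\big[\epsilon(z)\nabla z\big]$ over $s \in [0, 1]$, and Simpson's rule is exact for the quadratic
integrands that arise with this particular coefficient.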

Analysis  of  this  system  leads  to  the  following  error  representation. 

Theorem  7.1.  Set  xf/M  =  f  and  y/n-i  =  for  n  =  N,N -  I,---  ,1.  Then  the  error  of  iterative  multi¬ 
discretization  finite  element  solution  of  (51)  at  final  time  tN=T  can  be  expressed  as 

\[
(e(t_N), \psi) = \sum_{n=1}^{N}\Big(Q_{1,n} + Q_{2,n} + Q_{3,n} + Q_{4,n} + Q_{5b,n} + Q_{6b,n} + Q_{5c,n} + Q_{6c,n}\Big), \tag{57}
\]

where $Q_{1,n}$, $Q_{2,n}$, $Q_{3,n}$, $Q_{4,n}$ are as given in Theorem 4.3, and

Qsb,,  =  + 1  dt 

QebM  =  Acl>„r^„)  -ff  +r?‘o"’)  dt 

Qsc.n  =  E  f  ('5ei {f/''^">)Vf/{'^">,  dt 

Qec.n  =  L  ((5ei («<"»>) Vu'""’,  dt. 

with  (5ei(u^”‘')  u^”’’)  -ei(u,”“*,U2"'~*’),  and  similarly  /or  5ei  ((/*”''). 

A proof similar to that of Lemma 4.2 is beyond the scope of this paper. With the appropriate a
priori analysis in place, we expect that the terms $Q_{5b,n}$, $Q_{5c,n}$, $Q_{6b,n}$ and $Q_{6c,n}$ are small
compared to $Q_{1,n}, \dots, Q_{4,n}$.


8.  Conclusions 

In this paper we formulate and analyze an iterative multi-discretization Galerkin finite element
method for multi-scale reaction-diffusion equations. Subsystems in such reaction-diffusion equations
may exhibit significantly different spatial and temporal scales, motivating a multi-discretization
numerical method. We employ adjoint operators and variational analysis to form computational error
estimates for a quantity of interest calculated from the multi-discretization finite element method. A key
insight in analyzing the multi-discretization method is the realization that the adjoint operator
associated with the iterative multi-discretization approximation is different from that of the original
problem. Hence, our analysis utilizes two adjoint operators. One of the operators utilizes a different
linearization than the one commonly used for nonlinear problems. The other adjoint is based on the property
that our iterative multi-discretization Galerkin finite element method is a consistent discretization of the
analytic iterative method.

We derive a posteriori error estimates to quantify various sources of error in a quantity of interest
computed from our iterative finite element method. We first derive estimates for the case when the
different components of the system are solved on the same spatial mesh, and then extend the analysis
to include distinct meshes. The error estimator has terms indicating errors arising from discretization
of each component, finite iteration, differences between the two different adjoints, and projection. We
demonstrate the accuracy of our method through a variety of numerical examples, starting from simple
linear problems and ending with the non-linear multi-scale Brusselator problem. We demonstrate how
refining one or both meshes or increasing the number of iterations can decrease the specific error
components arising from a specific source. Hence our error estimates are useful not only for computing the
total error in a quantity of interest, but also for guiding an adaptive refinement strategy.


9.  Appendix 

We prove the convergence of the iterative scheme in Alg. 2. We consider $u_i(t)$ as functions from
the interval $I_n = [t_{n-1}, t_n]$ to a Banach space $X$, $u_i(t) : I_n \to X$, where $X = L^2(\Omega)$, for
$i = 1, 2$. Let $f(u) : (t_{n-1}, t_n] \times X \times X \to X \times X$ be uniformly Lipschitz continuous with
constant $L$, i.e. $\|f(u) - f(v)\|_{X\times X} \le L\|u - v\|_{X\times X}$. Let
$-A_i = \nabla\cdot\epsilon_i\nabla$ be the infinitesimal generator of the $C_0$ semigroup $G_i(t)$, $t \ge 0$,
on $X$. For simplicity of notation, we denote $A_1 = A_2 = A$ and $G_1 = G_2 = G$. Then, based on the theory of
semigroups [24], (6) and (7) on an interval $I_n$ are recast as,

\[
u_1^{(m)}(t) = G(t - t_{n-1})\,u_1(t_{n-1}) + \int_{t_{n-1}}^{t} G(t-s)\,f_1\big(u_1^{(m)}, u_2^{(m-1)}\big)\,ds, \tag{58}
\]

\[
u_2^{(m)}(t) = G(t - t_{n-1})\,u_2(t_{n-1}) + \int_{t_{n-1}}^{t} G(t-s)\,f_2\big(u_1^{(m)}, u_2^{(m)}\big)\,ds. \tag{59}
\]

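Let $M$ denote the bound on $\|G(t)\|$ on $[0, T]$. We then have.

Lemma 9.1. With Assumptions A.1 and A.2, the integral equation

\[
\xi(t) = G(t - t_{n-1})\,a + \int_{t_{n-1}}^{t} G(t-s)\,f_1(\xi, \beta)\,ds \tag{60}
\]

admits a unique solution $(\xi, \beta)$.

Proof. The proof follows arguments used for ordinary differential equations [14] and employs techniques
from [24]. Set $\xi^{(0)} = a$ and compute

\[
\xi^{(j)}(t) = G(t - t_{n-1})\,a + \int_{t_{n-1}}^{t} G(t-s)\,f_1\big(\xi^{(j-1)}, \beta\big)\,ds, \tag{61}
\]

for $j = 1, 2, \dots$. For $j = 1$ we have,

\[
\begin{aligned}
\|\xi^{(1)}(t) - G(t - t_{n-1})a\| &= \Big\|\int_{t_{n-1}}^{t} G(t-s)\,f_1(a,\beta)\,ds\Big\| \\
&= \Big\|\int_{t_{n-1}}^{t} G(t-s)\big(f_1(a,\beta) - f_1(0,0)\big)\,ds\Big\| \quad (\text{since } f_1(0) = 0) \\
&\le \Delta t_n M L\,\|(a,\beta)\|.
\end{aligned}
\]

Moreover, using a semigroup property (cf. page 5 in [24]),

\[
\|G(t - t_{n-1})a - a\| = \Big\|\int_0^{t - t_{n-1}} G(s)\,A a\,ds\Big\| \le \Delta t_n M\,\|Aa\|.
\]

Using the above results and the triangle inequality,

\[
\|\xi^{(1)}(t) - a\| \le \|\xi^{(1)}(t) - G(t - t_{n-1})a\| + \|G(t - t_{n-1})a - a\| \le \Delta t_n M L\,c_1,
\]

where $c_1 = \|(a,\beta)\| + L^{-1}\|Aa\|$. Now we use an induction argument, where our induction hypothesis is

\[
\|\xi^{(j-1)}(t) - \xi^{(j-2)}(t)\| \le c_1 (ML\Delta t_n)^{j-1}. \tag{62}
\]

Then, using the Lipschitz continuity of $f$ and our induction hypothesis (62) we have,

\[
\|\xi^{(j)}(t) - \xi^{(j-1)}(t)\| \le c_1 (ML\Delta t_n)^{j}.
\]

Now, if $ML\Delta t_n < 1$, then for $l > k > N$,

\[
\|\xi^{(l)}(t) - \xi^{(k)}(t)\| \le \sum_{j=k+1}^{l} \|\xi^{(j)}(t) - \xi^{(j-1)}(t)\| \le \frac{c_1 (ML\Delta t_n)^{N}}{1 - ML\Delta t_n}. \tag{63}
\]

Thus, $\|\xi^{(l)}(t) - \xi^{(k)}(t)\| \to 0$ as $N \to \infty$. Hence, $\xi^{(k)}(t)$ is a Cauchy sequence in the
Banach space $X$, and hence converges to an element in $X$. We pass to the limit in (61), so that this limit
satisfies (60). □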
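
The successive approximations used in the proof can be illustrated on a scalar model problem. The following
Python sketch is only an analogue of the Banach-space setting: it assumes the scalar semigroup
$G(t) = e^{-\lambda t}$ (so that $M = 1$), a nonlinearity $f_1(\xi, \beta) = \sin\xi + \beta$ with Lipschitz
constant $L = 1$ in $\xi$, and trapezoidal quadrature for the integral. The function name, the parameter
values and the grid size are hypothetical choices.

```python
import math


def picard_iterates(a, beta, lam, f, t0, dt, n_grid=100, n_iter=8):
    """Successive approximations xi^(j) on [t0, t0 + dt] for

        xi(t) = G(t - t0) a + int_{t0}^{t} G(t - s) f(xi(s), beta) ds,

    with G(t) = exp(-lam * t) and trapezoidal quadrature.
    Returns the sup-norm differences between consecutive iterates.
    """
    h = dt / n_grid
    ts = [t0 + i * h for i in range(n_grid + 1)]
    G = lambda t: math.exp(-lam * t)
    xi = [a] * (n_grid + 1)                      # xi^(0) = a
    diffs = []
    for _ in range(n_iter):
        new = []
        for i, t in enumerate(ts):
            integral = 0.0
            for k in range(i):                   # trapezoidal rule on [t0, t_i]
                left = G(t - ts[k]) * f(xi[k], beta)
                right = G(t - ts[k + 1]) * f(xi[k + 1], beta)
                integral += 0.5 * h * (left + right)
            new.append(G(t - t0) * a + integral)
        diffs.append(max(abs(x - y) for x, y in zip(new, xi)))
        xi = new
    return diffs


# Hypothetical data: M = 1, L = 1, dt = 0.5, so that M * L * dt = 0.5 < 1.
diffs = picard_iterates(a=1.0, beta=0.5, lam=1.0,
                        f=lambda x, b: math.sin(x) + b, t0=0.0, dt=0.5)
for j, d in enumerate(diffs, start=1):
    print(f"j = {j}: sup |xi^(j) - xi^(j-1)| = {d:.3e}")
```

Each difference is at most $ML\Delta t_n = 0.5$ times the previous one, in line with the geometric bound of
the induction step; in this scalar example the observed decay is in fact faster.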

Now  we  use  this  lemma  to  prove  the  convergence  of  Alg.  2. 

Theorem 9.1. With Assumptions A.1 and A.2, there exists $t_n > t_{n-1}$ such that the sequences of functions
$\{u_1^{(m)}\}$ and $\{u_2^{(m)}\}$ as defined in Alg. 2 converge to the exact solution of (2) on the time
interval $I_n = [t_{n-1}, t_n]$.

Proof. The existence of the sequences $\{u_1^{(m)}\}$ and $\{u_2^{(m)}\}$ is established by repeated application
of Lemma 9.1. For $m = 1$, we set $a = u_1(t_{n-1})$ and $\beta = u_2^{(0)}(t_{n-1})$. Then, by Lemma 9.1, there
exists a solution $(u_1^{(1)}, u_2^{(0)})$ to the integral equation governing $u_1^{(1)}$. We obtain a similar
result for $u_2^{(1)}$ by setting $a = u_2(t_{n-1})$ and $\beta = u_1^{(1)}(t_{n-1})$. Hence, repeated
application of this lemma shows the existence of the sequences $\{u_1^{(m)}\}$ and $\{u_2^{(m)}\}$. Moreover,
from the proof of Lemma 9.1 we have,

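\[
\|u_2^{(1)}(t) - u_2^{(0)}(t)\| = \|u_2^{(1)}(t) - u_2(t_{n-1})\| \le c_1 ML\Delta t_n.
\]

Thus,

\[
\begin{aligned}
\|u_1^{(2)}(t) - u_1^{(1)}(t)\|
&\le \int_{t_{n-1}}^{t} \Big\|G(t-s)\big(f_1(u_1^{(2)}, u_2^{(1)}) - f_1(u_1^{(1)}, u_2^{(1)})\big)\Big\|
 + \Big\|G(t-s)\big(f_1(u_1^{(1)}, u_2^{(1)}) - f_1(u_1^{(1)}, u_2^{(0)})\big)\Big\|\,ds \\
&\le ML\int_{t_{n-1}}^{t} \|u_1^{(2)} - u_1^{(1)}\|\,ds + c_1 ML\,(t - t_{n-1})^2.
\end{aligned}
\]

Setting $\tau_n = ML\Delta t_n \exp(ML\Delta t_n)$, we apply Gronwall's inequality,

\[
\|u_1^{(2)}(t) - u_1^{(1)}(t)\| \le c_1 ML\,(t - t_{n-1})^2 \exp(ML\Delta t_n) \le \frac{c_1\tau_n^2}{ML\exp(ML\Delta t_n)}. \tag{64}
\]

Similarly,

\[
\|u_2^{(2)}(t) - u_2^{(1)}(t)\| \le \frac{c_1\tau_n^2}{ML\exp(ML\Delta t_n)}. \tag{65}
\]

Now we use induction, where our induction hypothesis is,

\[
\|u_1^{(m-1)}(t) - u_1^{(m-2)}(t)\| \le c_n\tau_n^{m-1} \tag{66}
\]

and

\[
\|u_2^{(m-1)}(t) - u_2^{(m-2)}(t)\| \le c_n\tau_n^{m-1}, \tag{67}
\]

where $c_n = c_1/\big(ML\exp(ML\Delta t_n)\big)$. We have, by our induction hypothesis and Gronwall's inequality,

\[
\begin{aligned}
\|u_1^{(m)}(t) - u_1^{(m-1)}(t)\|
&\le ML\int_{t_{n-1}}^{t} \|u_1^{(m)} - u_1^{(m-1)}\|\,ds + ML\Delta t_n\,\|u_2^{(m-1)}(t) - u_2^{(m-2)}(t)\| \\
&\le ML\int_{t_{n-1}}^{t} \|u_1^{(m)} - u_1^{(m-1)}\|\,ds + c_n ML\Delta t_n\,\tau_n^{m-1} \\
&\le ML\Delta t_n\,c_n\tau_n^{m-1}\exp(ML\Delta t_n) = c_n\tau_n^{m}.
\end{aligned}
\]

For $\tau_n < 1$, and $l > k > N$,

\[
\|u_1^{(l)}(t) - u_1^{(k)}(t)\| \le \sum_{m=k+1}^{l} \|u_1^{(m)}(t) - u_1^{(m-1)}(t)\|
\le \sum_{m=N}^{\infty} \|u_1^{(m)}(t) - u_1^{(m-1)}(t)\| \le \frac{c_n\tau_n^{N}}{1 - \tau_n}.
\]

By enforcing $\tau_n < 1$, we get that $u_1^{(m)}$ is a Cauchy sequence that converges to an element in $X$.
This is also true for $u_2^{(m)}$. We pass to the limit in (58), so that it converges to the solution of the
implicit equation. □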
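
The role of the iteration count can be illustrated on a small model problem. The following Python sketch is
not the finite element method of the paper: it assumes a linear two-component system $u_1' = -a u_1 + b u_2$,
$u_2' = c u_1 - d u_2$, one backward Euler step, and hypothetical coefficients and function names. Within the
step, $u_1$ is updated using the lagged value of $u_2$ and $u_2$ is then updated using the new $u_1$, which
mirrors the ordering in (58) and (59).

```python
def coupled_step(u1, u2, dt, a, b, c, d):
    """One backward Euler step for the fully coupled system (direct 2x2 solve).

    Solves (1 + dt*a) v1 - dt*b v2 = u1 and -dt*c v1 + (1 + dt*d) v2 = u2.
    """
    a11, a12 = 1 + dt * a, -dt * b
    a21, a22 = -dt * c, 1 + dt * d
    det = a11 * a22 - a12 * a21
    v1 = (a22 * u1 - a12 * u2) / det
    v2 = (a11 * u2 - a21 * u1) / det
    return v1, v2


def iterative_step(u1, u2, dt, a, b, c, d, n_iter):
    """The same step solved by n_iter sweeps over the two components."""
    v1, v2 = u1, u2                      # initial guess: values at t_{n-1}
    for _ in range(n_iter):
        v1 = (u1 + dt * b * v2) / (1 + dt * a)   # uses lagged v2
        v2 = (u2 + dt * c * v1) / (1 + dt * d)   # uses updated v1
    return v1, v2


# Hypothetical coefficients and data.
params = dict(dt=0.1, a=1.0, b=2.0, c=3.0, d=1.0)
exact = coupled_step(1.0, 1.0, **params)
errors = []
for n_iter in range(1, 6):
    v1, v2 = iterative_step(1.0, 1.0, n_iter=n_iter, **params)
    errors.append(max(abs(v1 - exact[0]), abs(v2 - exact[1])))
    print(f"M = {n_iter}: iteration error = {errors[-1]:.3e}")
```

For these values the iteration error shrinks by a roughly constant factor per sweep, since the time step is
small enough for the sweep to be a contraction; a larger step or stronger coupling can slow or destroy the
convergence, which is the reason for the smallness condition $\tau_n < 1$ in the theorem.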